Lead Systems Engineer
Halvik
About the role
Halvik Corp delivers a wide range of services to 13 executive agencies and 15 independent agencies. Halvik is a highly successful woman-owned business (WOB) with more than 50 prime contracts and 500+ professionals delivering Digital Services, Advanced Analytics, Artificial Intelligence/Machine Learning, Cyber Security and Cutting-Edge Technology across the US Government. Be a part of something special!
The Lead Systems Engineer is responsible for leading the design, implementation, and operational support of enterprise‑grade monitoring and observability solutions for cloud‑hosted applications and infrastructure. This role provides technical leadership to a team of engineers while working closely with government customers to translate operational, performance, and availability requirements into reliable monitoring solutions.
The position oversees the administration and integration of platforms such as Splunk, AWS CloudWatch, Azure Monitor, and other application performance monitoring (APM) tools to ensure end‑to‑end visibility across complex cloud environments. Responsibilities include developing dashboards and SLA and performance reports, managing high‑volume data ingestion architectures, and implementing synthetic monitoring capabilities to proactively detect service degradation.
The role also plays a critical part in performance engineering and reliability operations by analyzing load and regression test results, conducting Java/JVM performance analysis, and leading incident triage efforts for AWS‑hosted applications. By combining deep technical expertise with customer engagement and team leadership, this position ensures high availability, performance, and reliability of mission‑critical systems.
Core Responsibilities:
- Strategic Execution: Executes cloud observability and reliability strategy by leading engineering teams, translating customer requirements into scalable monitoring solutions, and operationalizing performance, availability, and SLA objectives through Splunk and cloud‑native services.
- Operational Oversight: Provides operational oversight of cloud monitoring and observability platforms, ensuring SLA compliance, proactive issue detection, effective incident response, and continuous performance optimization for AWS‑hosted applications.
- Collaboration: Collaborates with government customers, engineering teams, and stakeholders to translate monitoring requirements into scalable solutions, coordinate incident response, and drive continuous improvement in cloud reliability and performance.
- Technical Performance: Ensures optimal application and infrastructure performance through advanced monitoring, JVM instrumentation, load and regression analysis, and Splunk‑based event analytics. Proactively identifies, analyzes, and resolves performance and reliability issues in AWS cloud environments to meet SLA and availability targets.
Minimum Requirements:
- Education: Bachelor's Degree in Computer Science with 8 years of experience or Master's Degree with 5 years of experience
- Experience: 8+ years of IT experience with 5+ years specializing in Application Performance Management, cloud monitoring, and performance engineering for complex, multi‑tier applications. Strong expertise in synthetic monitoring, Java/JVM performance analysis, and automation using Splunk, cloud‑native tools, and scripting.
- Technical Proficiency: Proficiency in Application Performance Management, synthetic availability monitoring, and performance analysis across complex, multi‑tier systems using Java/JVM, cloud, and APM tools. Strong scripting and automation skills (Shell, PowerShell, regular expressions) with expertise in performance tuning, monitoring, and data‑driven troubleshooting.
- Compliance: Ensures monitoring and observability solutions adhere to government, security, and operational compliance requirements while meeting defined SLAs and availability standards. Maintains compliant configurations, reporting, and data handling practices across cloud monitoring platforms and performance engineering processes.
- Certifications: Certifications in APM/observability (Splunk) and cloud platforms (AWS/Azure)
Preferred Expertise:
- Strong expertise in AWS cloud monitoring and Splunk observability platforms, with experience supporting government environments (USPTO preferred), combined with proven technical leadership and strong communication skills for customer‑facing collaboration.
- Strong analytical and problem-solving capabilities.
Halvik offers a competitive full benefits package including:
- Company-supported medical, dental, vision, life, STD, and LTD insurance
- 11 federal holidays and PTO
- Eligible employees may receive performance-based incentives in recognition of individual and/or team achievements.
- 401(k) with company matching
- Flexible Spending Accounts for commuter, medical, and dependent care expenses
- Tuition Assistance
- Charitable Contribution matching
Halvik Corp is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability or veteran status.
Halvik's pay range for this job level is a general guideline only and not a guarantee of compensation or salary. Additional factors considered in extending an offer include (but are not limited to) responsibilities of the job, education, experience, knowledge, skills, and abilities, as well as internal equity, alignment with market data, applicable bargaining agreement (if any), or other law.