Senior Lead Site Reliability Engineer
EPAM Systems, Inc.
About the role
Join our dynamic team as a Senior Lead Site Reliability Engineer focused on enhancing system reliability, observability, and performance monitoring for essential digital trading products.
In this pivotal role, you will spearhead monitoring initiatives within a high-availability trading environment, ensuring seamless connectivity with external partners while actively identifying opportunities for continuous improvement. At EPAM, you will engage with cutting-edge technologies, tackle complex challenges, and play a critical role in the evolution of digital innovation. We offer a supportive environment with extensive learning opportunities, mentorship, and access to global projects, empowering you to create significant change.
We are recruiting for this position to fill an urgent vacancy.
Key Responsibilities
- Develop and implement a strategic reliability vision for the trading portfolio, encompassing infrastructure, network connectivity, application performance, and throughput.
- Lead and guide a team of Site Reliability Engineers, providing technical leadership, mentorship, and performance feedback.
- Own and enhance the SLA/SLO/SLI framework, which includes error budgets and service health reporting.
- Configure and optimize comprehensive monitoring and alerting systems across infrastructure and applications.
- Drive observability best practices utilizing APM and monitoring platforms (e.g., Dynatrace).
- Analyze application and infrastructure performance to pinpoint fault domains and uncover root causes of critical incidents.
- Oversee major incident management, coordinate resolution efforts, and facilitate blameless postmortems.
- Participate in a 24x7x365 support rotation to ensure operational excellence within the team.
- Identify automation opportunities to enhance reliability, scalability, and operational efficiency.
Required Qualifications
- 8+ years of experience in Site Reliability Engineering, DevOps, or Production Engineering.
- Demonstrated leadership experience (technical lead or team lead), with the ability to mentor engineers effectively.
- Strong hands-on experience with SLA/SLO/SLI definition, governance, and reporting.
- Solid experience in Microsoft Azure environments (IaaS, PaaS, networking, monitoring).
- Hands-on experience with Dynatrace (configuration, alerting, dashboards, performance analysis).
- Familiarity with observability, monitoring, and APM tools in production settings.
- Ability to perform efficiently under pressure in time-sensitive, high-impact environments.
What We Offer
- Medical, Dental, and Vision Insurance (Subsidized).
- Health Savings Account.
- Flexible Spending Accounts (Healthcare, Dependent Care, Commuter).
- Short-Term and Long-Term Disability (Company Provided).
- Life and AD&D Insurance (Company Provided).
- Employee Assistance Program.
- Unlimited access to LinkedIn Learning solutions.
- Matched 401(k) Retirement Savings Plan.
- Paid Time Off - accrue 15-25 paid days, depending on level and tenure with EPAM.
- Paid Holidays - nine (9) total per year.
- Legal Plan and Identity Theft Protection.
- Accident Insurance.
- Employee Discounts.
- Pet Insurance.
- Employee Stock Purchase Program.
- If eligible, participation in the annual discretionary bonus program.
- If eligible and hired into a qualifying level, participation in the discretionary Long-Term Incentive (LTI) Program.
This remote position cannot be performed in New York City.
This posting states a good-faith estimate of the salary range EPAM anticipates paying the selected candidate. The range reflects base salary only; individual offers are based on factors such as geographic location, experience, credentials, education, training, demand for the role, and overall business considerations. Most candidates are offered salaries within the disclosed range. Salary range: $140,000 - $155,000. The details in this job posting also outline the expected benefits and compensation for the position.
In accordance with the LA County Fair Chance Ordinance, a summary of the ordinance’s key provisions is available here: Concept FCO Posting 8 27 24 (lacounty.gov)
EPAM Systems, Inc. promotes a diverse and inclusive work environment. We are committed to recruiting, hiring, developing, and advancing employees without discrimination. This commitment encompasses compliance with all applicable laws in the countries where we operate, while also recognizing the importance of equal opportunity and inclusion to foster success.
At EPAM, we ensure employment actions are based on individual qualifications, without regard to race, color, religion, creed, gender, pregnancy status, sexual orientation, gender identity, gender expression, marital or familial status, national origin, ancestry, genetics, age, disability status, veteran status, citizenship status (when otherwise legally able to work), or any other characteristic protected by law.