Site Reliability Engineer for AI Platforms
HRB
About the role
Take on the challenge as a Site Reliability Engineer (SRE) within an innovative tech landscape. Drive reliability and scalability of AI-based platforms while enhancing engineering tools with a focus on productivity and system performance.
In this role at our client’s company, you will establish platform strategies and best practices. You’ll improve CI/CD operations, ensure operational excellence, and automate deployment and monitoring. Cross-team collaboration is vital as you develop internal tools that improve productivity while meeting security compliance standards, including GDPR and SOC.
Key Responsibilities
- Develop scalable infrastructure to meet engineering needs
- Enhance continuous integration and deployment processes
- Streamline release tracking and automate deployments
- Ensure high availability by implementing observability
- Create tools that improve developer efficiency
Requirements
- 7+ years of relevant experience, including 5+ years as an SRE
- Extensive knowledge of AWS services and Kubernetes
- Proven track record with CI/CD and infrastructure-as-code tools, primarily AWS CDK
- Familiarity with monitoring solutions like Prometheus
- Understanding of security compliance frameworks
Transform your engineering career by optimizing real-world technological solutions.