Remote Site Reliability Engineer (DevOps) (Noida)
About the role
Position
We’re looking for a Senior DevOps Engineer to join our Site Reliability Engineering (SRE) team in Noida.
About Taazaa
Working at Taazaa involves engaging with cutting-edge technology and innovative software solutions in a collaborative environment. We emphasize continuous professional growth, offering workshops and training. Our employees often interact with clients to tailor solutions to business needs, working on diverse projects across industries. We promote work-life balance with flexible hours and remote options, fostering a supportive and inclusive culture. Competitive salaries, health benefits, and various perks further enhance the work experience.
Role Overview
We plan to deepen our expertise in AI and machine learning, enhance our cloud services, and continue fostering a culture of innovation and excellence. As a Site Reliability Engineer (SRE), you will play a pivotal role in the design, implementation, and maintenance of the infrastructure that supports our software development lifecycle. You will work closely with software engineers, QA, and IT teams to ensure the availability, reliability, and performance of our systems. Your primary focus will be on streamlining our deployment processes, improving system scalability, and ensuring a robust, secure, and cost‑efficient infrastructure.
Responsibilities
- Partner with product engineering squads to design, build, and operate highly reliable services
- Participate in on‑call rotation and drive rapid, effective resolution of production issues
- Collaborate closely with Staff Engineers / Team Leads to:
  - Embed reliability best practices into the development lifecycle
  - Mentor engineers on operational excellence, observability, and on‑call mindset
- Champion modern engineering and DevOps practices:
  - Effective use of AI‑assisted tools to accelerate scripting, debugging, and documentation
- Proactively identify and eliminate classes of failure through chaos engineering, capacity planning, and performance tuning
- Help evolve our technical strategy for reliability, scalability, and cost‑efficiency
Requirements
- 5+ years of professional experience in SRE, DevOps, or software engineering with a strong focus on production systems
- Deep hands‑on experience operating distributed cloud systems (AWS / GCP / Azure — at least one in depth, preferably AWS)
- Proficiency in at least one modern programming language used for tooling & automation (Go, Python, TypeScript/JavaScript, Rust)
- Experience building dashboards and alerts (Grafana, Groundcover, Datadog, New Relic, Prometheus, etc.)
- Experience defining and working with SLOs, SLIs, and error budgets
- Comfort with infrastructure as code and modern DevOps practices (CI/CD, GitOps, containers/Kubernetes)
- Excellent collaboration skills — you enjoy partnering with product engineers and teaching reliability concepts
- Bias toward automation and reducing manual toil
- Previous on‑call leadership or incident commander experience
- Background in performance engineering or capacity planning at scale
- Familiarity with service meshes, API gateways, or zero‑trust networking
- Experience mentoring or embedding within product squads
Soft Skills
- Collaboration and Teamwork: Work well within a team, encouraging collaboration and valuing diverse perspectives to achieve common goals and deliver high‑quality results.
- Adaptability and Flexibility: Stay adaptable in a fast‑paced, dynamic workplace, effectively managing changing priorities and requirements while maintaining focus on project objectives.
Benefits
Joining Taazaa Tech means thriving in a dynamic, innovative environment with competitive compensation and performance‑based incentives. You'll have ample opportunities for professional growth through workshops and certifications, while enjoying a flexible work‑life balance with remote options. Our collaborative culture fosters creativity and exposes you to diverse projects across various industries. We offer clear career advancement pathways, comprehensive health benefits, and perks like team‑building activities.
Company Vision
Taazaa Tech is a kaleidoscope of innovation, where every idea is a brushstroke on the canvas of tomorrow. It's a symphony of talent, where creativity dances with technology to orchestrate solutions beyond imagination. In this vibrant ecosystem, challenges are sparks igniting the flames of innovation, propelling us towards new horizons.