Cloud Engineer – Observability and SRE

Akkodis

San Francisco · Hybrid · Contract · Senior · $58 – $62/hr · 1w ago

About the role

Position Summary

The Grade 10 Cloud Engineer within the Customer’s Cloud Collaboration Technology Group will play a key role in building and operating scalable observability and infrastructure platforms supporting Webex microservices. This role requires strong hands-on expertise in Kubernetes, cloud infrastructure, and observability systems, along with the ability to operate independently and to own components end-to-end in production environments. Candidates will demonstrate extensive use of generative AI tools for code generation and production system troubleshooting.

Key Responsibilities

  • Design, develop, and operate observability platforms (logging, metrics, and/or tracing) for Webex microservices.
  • Manage and optimize Kubernetes clusters across multi-region environments.
  • Own CI/CD pipelines using Argo CD and Helm.
  • Implement Infrastructure as Code (IaC) using Terraform on AWS.
  • Operate monitoring ecosystems, including but not limited to: OpenSearch/ELK, Prometheus, Grafana, Splunk, and Kafka.
  • Build automation to detect and remediate production issues.
  • Ensure security compliance through vulnerability patching.
  • Collaborate cross-functionally to improve reliability.
  • Participate in on-call rotations and incident response.
  • Contribute to distributed system design and operations.

Required Education

  • Bachelor’s degree in computer science or related field.

General Technical Skills

  • At least 8 years of experience in a DevOps and/or SRE platform engineering role
  • Incident response and on-call operations: Demonstrated experience in a 24/7 production environment, including but not limited to:
    • Triaging alerts
    • Leading incident response
    • Writing post-incident reviews
    • Maintaining SLA commitments across large-scale distributed systems
  • IaC and automation: Proficiency with Terraform, Ansible, and/or equivalent IaC tooling for provisioning and managing cloud infrastructure at scale on AWS
  • Scripting and development: Working proficiency in Python, Golang, and/or Bash for building automation scripts, operational tooling, and/or CI/CD pipeline integrations (e.g., Drone, GitHub Actions, Argo CD).

Specific Technical Skills

  • Kubernetes and container orchestration: Production experience operating and troubleshooting workloads on Kubernetes at large scale (i.e., hundreds of deployments and thousands of pods), including but not limited to:
    • Helm chart management
    • Pod scheduling
    • Resource tuning
    • Multi-cluster operations
  • Observability stack expertise: Hands-on experience – performing pipeline design, query optimization, and/or capacity planning for high-volume environments – in at least two (2) of the following:
    • OpenSearch/Elasticsearch
    • Prometheus/Mimir
    • Grafana
    • Loki
    • Splunk
    • Logstash

Desired Skills

  • Apache Kafka/AWS MSK: Experience in at least one (1) of the following:
    • Operating or tuning Kafka clusters at scale
    • Managing the following across high-throughput streaming pipelines:
      • Topic configurations,
      • ACLs,
      • Consumer lag, and/or
      • Schema registries

Benefit Offerings

Benefit offerings available for our associates include medical, dental, vision, life insurance, short-term disability, additional voluntary benefits, an Employee Assistance Program (EAP), commuter benefits, and a 401(k) plan. Our benefit offerings provide employees the flexibility to choose the type of coverage that meets their individual needs. In addition, our associates may be eligible for paid leave, including Paid Sick Leave or any other paid leave required by Federal, State, or local law, as well as Holiday pay where applicable. Disclaimer: These benefit offerings do not apply to client-recruited jobs and jobs that are direct hires to a client.

Skills

Ansible, Argo CD, AWS, Bash, CI/CD, Docker, ELK, Grafana, Golang, Helm, Infrastructure as Code, Kafka, Kubernetes, Logstash, Mimir, OpenSearch, Prometheus, Python, Splunk, Terraform
