Site Reliability Engineer

Pacer Group

Longueuil · On-site · Full-time · Senior · 1w ago

About the role

Requirements

  • 7-8 years of experience in SRE, infrastructure, or operations for large-scale systems
  • Experience supporting IaaS platforms
  • Experience with infrastructure supporting GenAI applications
  • Strong programming/scripting skills (Python, Go, Java)
  • Experience with containerization (Docker) and orchestration (Kubernetes, etc.) tools
  • Experience with IaC (Terraform, Helm, CloudFormation, Ansible, etc.)
  • Knowledge of GPU / AI compute clusters
  • Experience with monitoring/alerting tools (Prometheus, Grafana, ELK/EFK, Datadog, etc.)
  • Networking & systems engineering knowledge (TCP/IP, DNS, routing, load balancing, distributed storage)

Skills

Ansible, CloudFormation, Docker, Datadog, ELK, EFK, Grafana, Go, Helm, IaaS, IaC, Java, Kubernetes, Prometheus, Python, SRE, Terraform
