Sr. Infra DevOps Engineer (Toronto, ON - Hybrid)

TestingXperts

Toronto · On-site · Contract · Senior · Posted yesterday

About the role

Title: Sr. Infra DevOps Engineer

Duration: 6+ months

Location: Toronto, ON (Hybrid)

Job Description:

Platform Infrastructure, CI/CD, EKS Operations, IaC & Cloud Cost Management

ROLE OVERVIEW & KEY RESPONSIBILITIES:

• Infrastructure Operations & On-Call: Own the on-call rotation for infrastructure-layer incidents; manage EKS cluster health, node scaling, networking, and availability; perform RCAs for infra failures.

• CI/CD Pipeline Management: Operate and maintain GitHub Actions pipelines; manage Argo CD GitOps deployments across dev, QA, and production; handle pipeline failures and improve reliability.

• DORA Metrics (Infrastructure Lens): Track Lead Time for Changes and Deployment Frequency at the infrastructure level; identify pipeline bottlenecks and drive continuous improvement.

• Infrastructure as Code (IaC): Write and maintain OpenTofu/Terraform scripts for AWS infrastructure provisioning; manage EKS, VPC, IAM roles, S3, RDS, and networking configurations.

• Kong API Gateway Operations: Administer Kong instances (K8s-deployed); manage plugins, routing policies, rate limits, JWT auth configuration, and gateway health monitoring.

• Security & Compliance Operations: Manage IAM roles and service account roles (SAR); rotate credentials and secrets; ensure SSDLC compliance for all infra changes; coordinate security reviews.

• Cost & Capacity Management: Monitor AWS spend; identify and act on cost optimization opportunities; manage resource right-sizing; report on infrastructure cost per service.

• Artifactory & Tooling: Operate Artifactory for image and artifact management; manage registry access controls; ensure pipeline dependencies are pinned and auditable.

REQUIRED SKILLS & EXPERIENCE:

AWS Infrastructure (Strong)

• 5+ years of hands-on AWS experience: EKS, EC2, VPC, IAM, S3, RDS, CloudWatch, Route53

• Strong Kubernetes administration: cluster setup, node groups, namespaces, RBAC, Helm charts

• Experience with AWS networking: VPC design, subnets, NAT gateways, security groups, peering

• Familiarity with AWS cost management tools and FinOps practices

DevOps & CI/CD (Core Competency)

• Deep experience with GitHub Actions or equivalent CI/CD platforms

• Hands-on Argo CD or Flux GitOps experience: deployment strategies, rollback, progressive delivery

• Container image management: Docker, Artifactory or ECR, image scanning

• Experience with secret management: HashiCorp Vault, AWS Secrets Manager, or equivalent

Infrastructure as Code

• Proficient with Terraform or OpenTofu: modules, state management, remote backends

• Experience writing IaC for EKS, VPC, and IAM from scratch

• Familiarity with Helm chart authoring and management

Operational Excellence (Core Competency)

• DORA metrics tracking at the infrastructure and pipeline level

• Experience running on-call rotations with structured incident management

• Runbook authoring for infrastructure failure modes

• SRE principles: error budgets, toil reduction, reliability engineering

Nice to Have:

• Kong API Gateway: plugin configuration, deck/declarative config, admin API

• OpenStack experience (reference: existing Kong test instance)

• Multi-cloud exposure (GCP or Azure) alongside AWS as the primary cloud

• Familiarity with Langfuse, Temporal, or data pipeline infrastructure

OpEx Ownership: This role owns Lead Time for Changes and pipeline reliability metrics. Targets: pipeline success rate > 95%; infrastructure incident MTTR < 1 hour; zero unplanned infra outages per sprint. The role also owns the monthly cloud cost report.
