Lead Associate Principal, Cloud Engineering

New York Technology Partners

Chicago · Hybrid · Full-time · Lead · $170k – $180k/yr · Today

About the role

Responsibilities

To perform this job successfully, an individual must be able to perform each primary duty satisfactorily.

  • Reports to the Director of Platform Automation and Cloud Engineering
  • Design, configure, implement and manage Kubernetes clusters and maintain a fully automated workflow for provisioning and managing a complex, highly available container orchestration environment using infrastructure as code
  • Develop and maintain Kubernetes operators, controllers, and custom resources to extend cluster functionality and automate application lifecycle management
  • Manage DevOps development activities and complex development tasks that will involve working with tools such as Docker, Kafka, container runtimes, and Kubernetes ecosystem tools
  • Lead and participate in Kubernetes cluster build-outs, upgrades, software installation, maintenance, and support, including, but not limited to, patches, security fixes, end-of-life preparation, and version upgrades
  • Implement and manage Kubernetes networking solutions, service mesh architectures, runtime security policies, and RBAC configurations to ensure secure and efficient cluster operations
  • Ensure the reliability of the Kubernetes platforms and containerized services your area of responsibility provides, and manage them to both specific and implied SLAs, to help the organization meet internal and external quality standards for the cloud platform
  • Assess and plan for capacity needs within Kubernetes clusters and the underlying cloud platform and forecast accordingly
  • Implement and manage initiatives within your assigned area of responsibility with accountability for results and compliance with all controls and security requirements
  • Lead in the development of technology roadmaps and end-of-life technology plans for Kubernetes versions, container runtimes, and related cloud-native technologies
  • Write and maintain documentation of relevant Kubernetes architectures, systems, procedures and processes
  • Effectively communicate project and operational service issues to senior management promptly with observations, decisions, and recommendations for corrective measures
  • Manage and participate in the implementation of production changes during defined maintenance windows and support the on-call rotation
  • Maintain appropriate work/personal balance within your team
  • Serve as a point of escalation within the team for Kubernetes and containerization support issues
  • Implement and manage rotational support schedules for after-hours and weekend work for your area of responsibility
  • Foster an atmosphere of trust, respect, and high performance while displaying strong ethics and integrity
  • Manage project and daily task planning and prioritization, and meet project deadlines while maintaining a high quality of work
  • Institute corrective actions to address audit and other regulatory or compliance findings
  • Operate within budget; establish and ensure adherence to schedules, work plans, and performance requirements
  • Other duties as assigned

Qualifications & Experience

The requirements listed are representative of the knowledge, skill, and/or ability required. Reasonable accommodations may be made to enable individuals with disabilities to perform the primary functions.

  • [Required] Strong consultative, communication, teamwork, and analytical skills, as you will regularly interact with teams distributed across the US
  • [Required] Working knowledge of Kubernetes architecture, container orchestration, and cloud-native infrastructure design and components, such as: etcd, networking, storage, and container runtimes
  • [Required] Extensive hands-on experience with Kubernetes cluster creation, maintenance, support, and administration in production environments
  • [Required] Deep understanding and practical implementation experience with Kubernetes networking (CNI plugins, service types, ingress controllers), runtime security (Pod Security Standards, OPA/Gatekeeper, network policies), and Role-Based Access Control (RBAC)
  • [Required] Experience with architecting, implementing and maintaining highly available mission critical Kubernetes environments for 24/7 availability
  • [Required] Experience working in an environment with a defined production change control process
  • [Required] Demonstrated history of meeting deadlines and ability to work well under pressure

Technical Skills & Background

  • [Required] Production-level hands-on experience with AWS cloud services and implementing Kubernetes on AWS (EKS or self-managed clusters)
  • [Required] Extensive experience with Infrastructure as Code using Terraform for provisioning and managing cloud infrastructure and Kubernetes resources
  • [Required] Strong hands-on development skills with demonstrable coding experience in Go or Python (Go strongly preferred for Kubernetes operator/controller development). Candidates must be able to provide specific examples of production code they have written.
  • [Required] Hands-on experience with Kubernetes ecosystem tools including: Helm, kubectl, container runtimes (containerd, CRI-O), and monitoring/observability tools
  • [Required] Experience with CI/CD tools such as Jenkins, GitLab CI, or GitHub Actions
  • [Required] Experience with version control using GitHub or similar platforms
  • [Required] Experience with configuration management tools such as Ansible, Puppet, or Chef
  • [Strongly Preferred] Hands-on experience with Kubernetes operator/controller development using operator frameworks (Kubebuilder, Operator SDK, or similar). This can be demonstrated through either contributions to open-source Cloud Native Computing Foundation (CNCF) projects or development of in-house Kubernetes operators/controllers. Note: If you have contributed to open-source CNCF projects, please include your GitHub profile link or links to notable pull requests in your resume.
  • [Preferred] Experience with Rancher and RKE2 (Rancher Kubernetes Engine 2) Kubernetes distribution
  • [Preferred] Experience with service mesh technologies (Istio, Linkerd) and Envoy proxy configuration and management
  • [Preferred] Experience designing and implementing multi-tenancy architectures in Kubernetes environments
  • [Preferred] Experience with GitOps-based continuous deployment using FluxCD, ArgoCD, or Rancher Fleet
  • [Preferred] Experience with Kafka and event-driven architectures

Certifications

  • [Preferred] CKA and CKS certifications strongly desired
  • [Preferred] AWS Solutions Architect Associate Certification or higher
  • [Preferred] Relevant industry certifications such as Microsoft Azure or Google Cloud Platform

Education & Training

  • [Required] Bachelor's degree, preferably in a technical discipline (Computer Science, Mathematics, Engineering, etc.), or equivalent combination of education and experience required
  • [Required] 7+ years’ experience in IT systems installation, operations, administration, and maintenance of cloud systems / virtualized servers, with demonstrated significant experience in Kubernetes and container orchestration platforms
  • [Preferred] Experience working in a financial services or highly regulated environment

Skills

Ansible, AWS, AWS EKS, Chef, CI/CD, containerd, CRI-O, Docker, Envoy, FluxCD, GitLab CI, GitOps, Go, Helm, Istio, Jenkins, Kafka, Kubernetes, Linkerd, Puppet, Python, Rancher, RKE2, Terraform
