Lead Cloud Engineer, SRE – XP +5
Tricefal
About the role
Do you have a passion for technology and DevOps practices? Do you want to make a difference by developing and managing cloud infrastructure for a critical player in market infrastructure, used by some of the biggest companies in the world?
We are looking for a skilled Lead Cloud Engineer with VMware, AWS, Azure, and Google experience to join the Digital Asset Clear development team. This role is vital in designing, deploying, and maintaining resilient, scalable, and highly available cloud infrastructure. Collaboration with development teams, operations, and partners will be crucial in ensuring efficient performance and availability of our critical environments.
Key Responsibilities and Accountabilities
- Implements, tunes, and conducts ongoing administration of infrastructure layer, including proposing application systems changes, better uses and improvements.
- Improves the whole lifecycle of services from inception and design, through deployment, operation, and refinement.
- Leads support for services before they launch through activities such as system design consulting, developing features of the infrastructure platforms and frameworks, capacity planning, and launch reviews.
- Provides mentorship to other team members on handling availability and performance of critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions.
- Maintains services once they are live by measuring and monitoring availability, latency, and overall system health.
- Leads sustainable and efficient incident response.
- Scales systems sustainably through mechanisms like automation, and evolves systems by advocating for changes that improve reliability and velocity.
- Writes highly optimized and accurate code for LCH SA and LSEG products and solutions.
- Proactively continues to build and apply relevant domain knowledge that may relate to workflows, data pipelines, business policies, configurations and constraints.
- Supports essential processes while ensuring high quality standards are met.
What You'll Be Doing
- Design, deploy, and maintain highly available and scalable infrastructure solutions to support our critical applications and services. This work incorporates security and compliance standards and requires an understanding of a range of IaaS and PaaS services (compute, storage, database, security services).
- Work closely with the existing development team and provide your expertise to mentor the team members.
- Build operational tools for deployment, monitoring, and analysis of critical infrastructure. Develop and improve automation tools, scripts, and frameworks to streamline administration tasks, improve efficiency, and reduce manual effort. Build and maintain monitoring solutions to proactively identify and resolve performance issues.
- Automate deployment, configuration, and maintenance processes using infrastructure-as-code (IaC) tools and technologies.
- Collaborate with infrastructure teams to estimate resource requirements, plan for future growth, and ensure infrastructure scalability to meet evolving business needs.
- Design and implement robust backup and recovery strategies for various services, ensuring integrity and quick recovery in case of failures or disasters.
- Participate in incident management activities, including root cause analysis, mitigation, and resolution of related incidents.
- Embed into cloud projects and on-call rotations to keep your skills sharp and stay close to the operational workflows and issues.
- Work on systems: edge cases, failure modes, behaviors, specific implementations.
You’ll Bring
- Bachelor’s degree in Computer Science, Engineering, or a related field (or equivalent experience).
- Experience (at least 5 years) as a Cloud Engineer, SRE, or in a similar role, with a focus on managing and maintaining cloud infrastructure in regulated and/or highly available environments.
- Expertise in the AWS, Azure, and Google cloud platforms (at least 3 years).
- Expertise (at least 3 years) in scripting languages (Python, Bash) and in automation/configuration management tools (Ansible, Puppet, Chef).
- Expertise in containerization technologies (Docker, Kubernetes).
- Experience with Linux/Unix systems (at least 5 years).
- Experience with VMware technology (at least 2 years).
- Experience on operat