Distributed Systems Engineer

Oracle

Montpelier · On-site · Full-time · Lead · 3w ago

About the role

At OCI, we are pioneering the development of the world's largest AI clusters and accelerating their delivery to the market. The AI Infrastructure team is at the forefront of this initiative, creating a GPU-focused cloud that maximizes performance, efficiency, reliability, and scalability. Join us in the AI revolution by building systems that empower customers to scale from tens to thousands of GPUs seamlessly, without sacrificing performance. You’ll have the chance to work with cutting-edge technologies and have a direct impact on our organization’s success.

We are seeking an exceptional distributed systems engineer to enhance and optimize AI infrastructure components, such as the GPU control plane and data plane that deliver computing resources for customer AI workloads. In this role, you will provide technical leadership, clarify complex issues, and devise innovative solutions. You'll collaborate with cross-functional teams to improve our AI infrastructure, ensuring an outstanding customer experience and optimal performance.

Responsibilities

  • Design and develop scalable and optimized solutions for AI compute infrastructure components like the GPU control and data planes, focusing on enhancing customer experience and workload performance.

Skills

GPU
