Distributed Systems Engineer
Oracle
About the role
At OCI, we are pioneering the development of the world's largest AI clusters and accelerating their delivery to the market. The AI Infrastructure team is at the forefront of this initiative, creating a GPU-focused cloud that maximizes performance, efficiency, reliability, and scalability. Join us in the AI revolution by building systems that empower customers to scale from tens to thousands of GPUs seamlessly, without sacrificing performance. You’ll have the chance to work with cutting-edge technologies and have a direct impact on our organization’s success.
We are seeking an exceptional distributed systems engineer to enhance and optimize AI infrastructure components, such as the GPU control plane and data plane that deliver computing resources for customer AI workloads. In this role, you will provide technical leadership, clarify complex issues, and devise innovative solutions. You'll collaborate with cross-functional teams to improve our AI infrastructure, ensuring an outstanding customer experience and optimal performance.
Responsibilities
- Design and develop scalable and optimized solutions for AI compute infrastructure components like the GPU control and data planes, focusing on enhancing customer experience and workload performance.
Skills