Senior AI Engineer, Machine Learning Operations
WhatJobs Direct
About the role
Our client is at the forefront of AI innovation, developing solutions that are revolutionizing industries. We are seeking a highly skilled and experienced Senior AI Engineer specializing in Machine Learning Operations (MLOps) to join our fully remote team. This is a critical role where you will be responsible for building, deploying, and managing scalable and robust machine learning pipelines and infrastructure. You will bridge the gap between data science and software engineering, ensuring the seamless operation and continuous improvement of our AI models.
Responsibilities:
Design, implement, and maintain robust MLOps pipelines for model training, validation, deployment, and monitoring.
Develop and manage infrastructure for scalable machine learning workloads, leveraging cloud platforms (AWS, Azure, GCP).
Automate machine learning workflows, including data preprocessing, model training, hyperparameter tuning, and evaluation.
Implement continuous integration and continuous delivery (CI/CD) practices for machine learning models.
Monitor model performance in production, identify drift, and implement retraining strategies.
Develop tools and frameworks to streamline the machine learning lifecycle.
Collaborate closely with data scientists and software engineers to understand model requirements and deployment challenges.
Ensure the security, scalability, and reliability of ML systems.
Stay current with the latest MLOps best practices, tools, and technologies.
Contribute to the development of AI strategy and best practices within the organization.
Troubleshoot and resolve complex issues related to ML infrastructure and model deployment.
Mentor junior engineers and promote MLOps principles across teams.
Qualifications:
Master's or Ph.D. in Computer Science, Machine Learning, or a related quantitative field.
Minimum of 5 years of experience in software engineering or data science, with at least 3 years focused on MLOps or ML infrastructure.
Proven experience building and managing ML pipelines using tools like Kubeflow, MLflow, Airflow, or similar.
Strong proficiency in Python and relevant ML libraries (e.g., TensorFlow, PyTorch, Scikit-learn).
Extensive experience with cloud platforms (AWS SageMaker, Azure ML, Google AI Platform) and containerization technologies (Docker, Kubernetes).
Deep understanding of CI/CD principles and tools (e.g., Jenkins, GitLab CI, GitHub Actions).
Experience with data engineering principles and technologies.
Strong understanding of machine learning concepts, algorithms, and evaluation metrics.
Excellent problem-solving, debugging, and analytical skills.
Ability to work effectively in a fast-paced, collaborative, and remote environment.
Strong communication skills, with the ability to convey complex technical concepts clearly.
If you are a passionate MLOps expert looking to drive the operational excellence of AI systems in a fully remote setting, this is the role for you. Join our client and shape the future of applied AI.