
AI Inference Engineer (vLLM and Kubernetes), Portuguese Speaking

Elektu

Centurion · On-site · Full-time

About the role

Introduction

12-month contract.

The AI Inference Engineer (vLLM and Kubernetes) is a critical, highly specialized role at the forefront of the modern MLOps landscape. The position is designed for a senior engineer who combines deep Red Hat Enterprise Linux (RHEL) systems administration expertise with modern AI infrastructure knowledge.

Duties & Responsibilities

Infrastructure Reliability: Ensure 99.99% availability of our LLM inference endpoints through robust Kubernetes orchestration on RHEL nodes.

Operational Excellence: Implement end-to-end automation with Ansible to ensure that a new GPU-enabled node can be provisioned, hardened, and added to the cluster with zero manual intervention.

Cost Efficiency: Monitor and optimize GPU utilization (NVIDIA/AMD) to ensure we are achieving the highest possible tokens-per-second throughput.

Security & Compliance: Harden the RHEL environment and container stack to meet POPIA and the evolving international data security standards relevant in 2026.

Desired Experience & Qualification

5+ years of relevant experience

Red Hat Certified Engineer (RHCE) or Red Hat Certified Architect (RHCA)

Certified Kubernetes Administrator (CKA) or Red Hat Certified Specialist in OpenShift

AWS Certified Solutions Architect (Pro) or Azure Solutions Architect Expert

Package & Remuneration

TBC
