
GPU Infrastructure & AI Platform Engineer

Link Consulting Services

Herndon · On-site · Full-time · Mid Level · 3d ago

Role Overview:

We are seeking a hands-on engineer to deliver end-to-end GPU infrastructure and AI/GenAI environments in a lab or data center setting. The role spans hardware installation, platform setup, infrastructure optimization, and monitoring implementation, culminating in a fully operational, validated environment.

Key Responsibilities:

  • Install and rack-mount GPU servers, including cabling, firmware/OS baseline configuration, driver installation, and integration testing
  • Set up AI/GenAI environments using container runtimes (Docker/Kubernetes) and deploy inference tooling, delivering at least one validated use case
  • Perform rack modernization and infrastructure cleanup, including audit, optimized rack design, equipment reorganization, and structured power/data cable remediation
  • Implement monitoring solutions for GPU servers and lab infrastructure, including dashboards, alerts, agent deployment, and documentation handover
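The driver-installation and integration-testing work in the first bullet typically starts with inventorying GPUs and confirming a consistent driver version across the node. A minimal sketch of that validation step, assuming `nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader` output is available; the sample string below is illustrative, not captured from real hardware:

```python
import csv
import io

def parse_gpu_inventory(smi_csv: str):
    """Parse CSV-style nvidia-smi query output into one dict per GPU."""
    fields = ["name", "driver_version", "memory_total_mib"]
    gpus = []
    for row in csv.reader(io.StringIO(smi_csv.strip())):
        gpu = dict(zip(fields, (v.strip() for v in row)))
        # Memory is reported as e.g. "81559 MiB"; keep just the number.
        gpu["memory_total_mib"] = int(gpu["memory_total_mib"].split()[0])
        gpus.append(gpu)
    return gpus

# Illustrative output for a hypothetical 2-GPU node.
sample = """\
NVIDIA H100 80GB HBM3, 550.54.15, 81559 MiB
NVIDIA H100 80GB HBM3, 550.54.15, 81559 MiB
"""

inventory = parse_gpu_inventory(sample)
# Integration-test style check: every GPU should report the same driver.
assert all(g["driver_version"] == inventory[0]["driver_version"]
           for g in inventory)
print(f"{len(inventory)} GPUs, driver {inventory[0]['driver_version']}")
```

In practice the sample string would come from running `nvidia-smi` via `subprocess`, and the same check would be folded into a post-install validation suite.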

Required Skills:

  • Experience with GPU servers and data center environments (rack, power, cabling)
  • Strong Linux administration and system configuration
  • Knowledge of GPU drivers, CUDA, and performance validation
  • Experience with Docker and/or Kubernetes
  • Familiarity with AI/GenAI inference tools (e.g., Triton, vLLM, Ollama, or similar)
  • Experience with monitoring tools (Prometheus/Grafana, Zabbix, or equivalent)
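For the Prometheus/Grafana side of the monitoring requirement, GPU metrics are usually surfaced in the Prometheus text exposition format (`name{label="value"} value`). A minimal sketch of that rendering step; the metric names, label keys, and host name below are hypothetical, not part of any standard exporter:

```python
def prometheus_exposition(samples):
    """Render (name, labels, value) samples in the Prometheus
    text exposition format: name{k="v",...} value"""
    lines = []
    for name, labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        lines.append(f"{name}{{{label_str}}} {value}")
    return "\n".join(lines) + "\n"

# Hypothetical samples for one GPU on one lab node.
samples = [
    ("gpu_utilization_percent", {"gpu": "0", "host": "lab-node-01"}, 87),
    ("gpu_memory_used_bytes",   {"gpu": "0", "host": "lab-node-01"}, 64424509440),
]
text = prometheus_exposition(samples)
print(text)
```

A real deployment would instead use an existing exporter (e.g. NVIDIA's DCGM exporter) scraped by Prometheus, with Grafana dashboards and alert rules layered on top; the sketch only shows the wire format those tools agree on.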

Experience:

  • 5+ years in systems, infrastructure, or data center engineering
  • Proven experience delivering GPU or AI infrastructure deployments

Skills

AI · CUDA · Docker · GenAI · Grafana · GPU · Kubernetes · Linux · Ollama · Prometheus · Triton · vLLM · Zabbix
