AI DevOps/Platform Engineers
Vish Consulting Services
Remote · Canada · Full-time · Senior · $40 – $50/hr · 4d ago
About the role
The AI DevOps/Platform Engineer will join the AI Enablement team, focusing on building, maintaining, and scaling enterprise AI infrastructure. This includes proprietary agent orchestration platforms (NOVA), AI gateway services, and Retrieval-Augmented Generation (RAG) pipelines across multi-cloud environments.
Responsibilities
- Platform Development & Operations:
  - Develop, deploy, and maintain the NOVA agentic AI platform
  - Manage LiteLLM as the central AI gateway
  - Optimize LLM routing, cost control, load balancing, and failover
  - Implement monitoring and observability (Prometheus, Grafana, OpenTelemetry)
- RAG Pipeline Development:
  - Design and optimize RAG pipelines
  - Maintain document ingestion, chunking, embeddings, and vector stores
  - Build RAG on GCP and Azure using managed AI services and vector databases
- Infrastructure & DevOps:
  - Deploy AI services on Kubernetes (AKS, GKE)
  - Implement CI/CD with Jenkins, Opsera, GitHub Actions
  - Automate infrastructure (Terraform, Helm, GitOps)
  - Ensure security and compliance
- Agentic AI & Automation:
  - Develop automation tools and scripts
  - Build MCP servers for tool integrations
  - Enable multi-agent orchestration and autonomous workflows
  - Create SDKs, APIs, and developer documentation
Required Qualifications
- 8+ years platform engineering/DevOps experience
- 2+ years AI/ML or LLM platform experience
- Strong Kubernetes, CI/CD, and cloud experience (GCP or Azure)
- Proficiency in Python and/or TypeScript
Technical Environment
- AI Platforms: LiteLLM, LangChain, LangGraph
- Cloud: GCP, Azure
- Containers: Kubernetes, Docker, Helm
- CI/CD: Jenkins, GitHub Actions, Opsera
- Observability: Prometheus, Grafana, OpenTelemetry, Dynatrace
- Languages: Python, TypeScript, Bash
Job Type
Permanent
Pay
$40.00‑$50.00 per hour
Work Location
Remote
Skills
Bash, CI/CD, Docker, Dynatrace, GCP, GitHub Actions, Grafana, Helm, Jenkins, Kubernetes, LangChain, LangGraph, LiteLLM, OpenTelemetry, Opsera, Prometheus, Python, Terraform, TypeScript