DevOps / DataOps Engineer (Private AI / AI Workbench)
RighIT Solutions LLC
On-site, Contract, Senior, posted 3d ago
About the role
Job Description
The DevOps/DataOps Engineer builds and operates the deployment and data foundations for AI Workbench. This role automates infrastructure provisioning, manages Kubernetes/container operations, builds CI/CD pipelines, and implements secure data pipelines and observability so AI-driven solutions run reliably in private or hybrid environments.
Key Responsibilities
- Provision and manage environments using Infrastructure as Code (IaC) for compute, networking, storage, and security controls.
- Administer Kubernetes/container platforms to host AI Workbench services, agent runtimes, model endpoints, and supporting components.
- Build and maintain CI/CD pipelines with automated checks for applications and services, agent configurations, and infrastructure updates.
- Implement DataOps pipelines for RAG ingestion: secure connectors, preprocessing jobs, scheduling, data quality checks, and lineage tracking.
- Implement observability: logs, metrics, traces, dashboards, alerting, and SLO/SLI monitoring across platform and workloads.
- Harden environments: secrets management, vulnerability scanning, image signing, policy-as-code, and least-privilege access.
- Support release management, incident response, and operational handover including runbooks and knowledge transfer.
- Optimize performance and cost: resource sizing, autoscaling policies, GPU scheduling, and storage optimization.
Required Qualifications
- 5+ years in DevOps/SRE and/or DataOps roles supporting enterprise platforms.
- Hands-on experience with Kubernetes and container tooling.
- Experience building CI/CD pipelines and IaC automation.
- Strong knowledge of security practices in platform operations.
- Experience with data pipelines and ETL/ELT concepts.
Preferred Qualifications
- Experience supporting AI/ML platforms, GPU workloads, or model inference services.
- Experience with policy-as-code (OPA/Gatekeeper or similar tools) and compliance-driven operations.
- Familiarity with vector databases and search indexing operations.
- Experience with hybrid connectivity and network security patterns.
Key Skills
- Kubernetes, containers, Helm/GitOps concepts
- IaC (Terraform/Ansible), CI/CD
- Data pipelines, scheduling, data quality
- Security hardening, secrets, scanning
- Monitoring/observability (metrics, logs, tracing)
Skills
Ansible, CI/CD, Containers, Data Quality, Data Pipelines, ETL, GitOps, Helm, IaC, Kubernetes, Metrics, Monitoring, Observability, Scheduling, Secrets Management, Security Hardening, Terraform, Tracing, Vulnerability Scanning