Google Cloud Platform (GCP) AIOps Cloud Engineer
KPMG India Services LLP
About the role
GCP Cloud Engineer Consultant
Job Title
GCP AIOps Cloud Engineer
Location
Bangalore / India (Hybrid/Onsite as applicable)
Experience
Mandatory Skills
• 4 to 7 years of hands-on experience with Google Cloud Platform (GCP).
Role Summary
We are looking for a GCP AIOps Cloud Engineer to design, implement, and operate scalable, reliable, and intelligent cloud platforms on Google Cloud Platform (GCP). The role focuses on Cloud Operations, Observability, Automation, and AIOps, leveraging AI/ML‑driven insights to proactively detect anomalies, reduce incidents, and improve system reliability.
Key Responsibilities
Cloud Engineering & Operations
• Design, build, and manage highly available and resilient GCP infrastructure
• Operate and optimize production workloads on GCP (GKE, Compute Engine, Cloud Run, Cloud Functions)
• Implement Infrastructure as Code (IaC) using Terraform
• Ensure security, reliability, scalability, and cost optimization across environments
AIOps & Observability
• Implement AIOps‑driven monitoring and alerting using:
  • Cloud Monitoring, Logging, Metrics Explorer
  • AI‑assisted anomaly detection and alert correlation
• Build proactive incident detection, root cause analysis, and noise reduction
• Enable MTTR reduction through intelligent operational insights
• Integrate observability platforms with ITSM tools (e.g., ServiceNow – where applicable)
DevOps & Automation
• Design and maintain CI/CD pipelines (Cloud Build, GitHub, Docker, Kubernetes)
• Automate deployments and operational workflows
• Support GitOps / DevOps best practices
• Enable self‑healing and auto‑remediation mechanisms
Containers & Platform Engineering
• Manage and optimize Google Kubernetes Engine (GKE)
• Support containerized and microservices‑based architectures
• Implement scaling, resilience, and workload optimization strategies
Reliability, Security & Governance
• Apply SRE principles (SLIs, SLOs, error budgets)
• Implement IAM, VPC networking, firewall rules, and security best practices
• Support compliance, audit, and governance requirements
• Perform capacity planning and cost optimization (FinOps awareness)
Collaboration & Delivery
• Work closely with application, data, and platform teams
• Participate in design reviews, incident postmortems, and operational readiness reviews
• Contribute to reusable accelerators, templates, and best practices
Required Skills & Qualifications
Core GCP Skills
• Strong hands‑on experience with:
  • GKE, Compute Engine, Cloud Run, Cloud Functions
  • Cloud SQL, BigQuery, Cloud Storage
  • VPC, IAM, Load Balancing
• Solid Linux and networking fundamentals
AIOps & Monitoring
• Experience with cloud observability and AIOps concepts
• Knowledge of anomaly detection, alert correlation, and event noise reduction
• Experience implementing monitoring dashboards and alerts
DevOps & Automation
• Terraform for IaC
• Docker & Kubernetes
• CI/CD tools (Cloud Build, GitHub Actions or equivalent)
• Scripting (Python, Bash)
Nice to Have
• Multi‑cloud exposure (AWS / Azure)
• ServiceNow or ITSM integrations
• FinOps / cost optimization experience
• Exposure to ML‑assisted operations or SRE tooling
Certifications (Preferred)
• Google Cloud Associate Cloud Engineer
• Google Cloud Professional DevOps Engineer
• Google Cloud Professional SRE / Architect (plus)
What Success Looks Like
• Reduced incident frequency and MTTR through AIOps
• Stable, secure, and scalable GCP platforms
• High automation, minimal manual operations
• Strong collaboration across engineering teams
Primary Roles and Responsibilities
— Design, engineer, integrate, and enhance Agile and DevOps enablement tools and applications using DevOps principles
— Follow SDLC process and practices (Functional Specifications and Testing, Design Specifications, Code Reviews, Unit Testing, Monitoring)
— Implement, manage and fine-tune monitoring and alerting systems to ensure robust system performance and swift incident response
— Utilize Python and Groovy for scripting and automation tasks related to tools administration and integrations
— Work closely with security teams to ensure tool compliance with organizational security policies
— Reduce toil, increase automation, and evaluate new technologies (e.g., OpenAI) for their applicability to new requirements
— Maintain a solid understanding of source control systems (Git, Subversion)
— Administer Bitbucket, Jenkins and GitHub to automate build and deployment processes
— Manage and configure JFrog Artifactory and NexusIQ for effective package repository maintenance
— Administer SonarQube for continuous code quality assessments
— Configure, build, and deploy applications to on-premises and cloud environments (DevOps pipelines, GitHub Actions)
— Administer, manage and configure Jira and Confluence to foster team collaboration and productivity
Preferred Skills
— Must Have: Proficiency in Artifactory, NexusIQ, SonarQube, Jenkins, Docker, Bitbucket, GitHub, Python, Groovy, Jira, Confluence, Remedy, ServiceNow
— Nice to Have: Expertise in Terraform, AKS, Azure DevOps, AWS, Azure services
• Bachelor's degree in Computer Science, Engineering, or a related field
• 5+ years of experience in a DevOps role
• Excellent problem-solving skills
Experience Level
Senior Level