
Data Center Engineering Operations Engineer

Amazon India Limited

Hyderabad · On-site · Full-time · Mid Level · Today


About the role

Below is a quick‑start guide you can use to turn this posting into a winning application.
It includes:

A concise “Job‑at‑a‑glance” summary you can paste into a notes file or the “About Me” section of your résumé.
Key keywords & skills to sprinkle throughout your résumé, LinkedIn profile, and interview answers.
Resume‑bullet templates (with measurable impact) that map directly to the responsibilities and qualifications the hiring team is looking for.
A fully‑customizable cover‑letter draft (≈ 350 words) that hits every required and preferred qualification while showcasing your unique story.
Interview‑prep cheat sheet – the top 8 questions you’re likely to face and a framework for answering them (STAR + “scale” focus).

Feel free to copy‑paste, edit, and reuse any part of it. If you’d like a deeper dive (e.g., a full‑length résumé rewrite, a list of projects to showcase, or mock interview practice), just let me know!

1️⃣ Job‑at‑a‑glance (One‑sentence elevator pitch)

Data Center Engineering Operations Engineer – Technical SME responsible for 24/7 uptime of AWS‑scale critical infrastructure (generators, UPS, PDUs, chillers, BAS, fire/life‑safety), driving root‑cause analysis, vendor management, preventative maintenance, and continuous‑improvement initiatives to meet 99.999 % availability while reducing cost and improving sustainability.

2️⃣ Keywords & Skills to Mirror

Category	Keywords (copy verbatim)
Core Technical	generators, switchgear, UPS systems, PDUs, chillers, cooling towers, air handlers, building automation systems (BAS), fire/life‑safety equipment, electrical & mechanical troubleshooting, preventive & corrective maintenance
Process & Ops	root‑cause analysis, corrective actions, performance benchmarks, dashboards, metrics, capacity planning, incident response, SLA adherence, 24/7 uptime, 99.999 % availability
Leadership & Collaboration	vendor management, contractor oversight, sub‑contractor compliance, cross‑functional collaboration, DCO managers, business leaders, continuous‑improvement initiatives
Safety & Compliance	safety standards, environmental regulations, OSHA, NFPA, local legislation, safety audits
Tools & Software	Microsoft Office (Outlook, Word, Excel), data‑center monitoring platforms, CMMS (e.g., Maximo, ServiceNow), reporting tools
Preferred Extras	construction/project management, Electrical/Mechanical Journeyman License, DC II / DC III operating engineering license

Tip: Use exact phrasing from the posting wherever possible (e.g., “technical subject matter expert,” “root cause analysis,” “corrective actions”), because applicant tracking systems (ATS) favor exact matches.
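If you want a rough self-check before using an ATS scanner, the sketch below counts how many target keywords appear in your résumé text. It is only an illustration: the function name `keyword_coverage`, the sample keywords, and the sample résumé are assumptions, and real ATS products may match terms quite differently.

```python
import re

def _normalize(text):
    # Treat non-breaking hyphens (U+2011) like ordinary hyphens,
    # collapse whitespace, and ignore case.
    text = text.replace("\u2011", "-")
    return re.sub(r"\s+", " ", text).lower()

def keyword_coverage(resume_text, keywords):
    """Return (fraction of keywords found, list of missing keywords)."""
    if not keywords:
        raise ValueError("keywords must not be empty")
    normalized = _normalize(resume_text)
    missing = [kw for kw in keywords if _normalize(kw) not in normalized]
    matched = len(keywords) - len(missing)
    return matched / len(keywords), missing

# Hypothetical sample data; replace with your own keywords and résumé text.
keywords = ["root-cause analysis", "vendor management",
            "UPS systems", "capacity planning"]
resume = """Led root-cause analysis for UPS systems faults.
Owned vendor management for HVAC contractors."""

coverage, missing = keyword_coverage(resume, keywords)
print(f"{coverage:.0%} matched; missing: {missing}")
```

This is plain substring matching, so “root cause analysis” and “root-cause analysis” count as different phrases; that strictness is the same reason exact phrasing matters.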

3️⃣ Resume‑Bullet Templates (Tailor with your numbers)

[Action verb] + [Task/Technology] + [Result/Metric]

Responsibility (from posting)	Sample bullet (replace the sample figures with your own)
Hands‑on troubleshooting of critical infrastructure (generators, UPS, chillers, BAS)	Diagnosed & resolved critical generator and UPS failures across a 2 MW data center, restoring power within 12 minutes (vs. 30‑minute SLA) and preventing $250K in potential downtime loss.
Managing contractors & vendors for compliance	Oversaw 15+ external contractors (electrical, HVAC, fire‑safety) ensuring 100 % compliance with OSHA/NFPA standards during a $12M expansion, avoiding $500K in penalties.
Conducting root‑cause analysis & corrective actions	Led RCA for a recurring PDU overload issue, identified a mis‑rated circuit breaker, and implemented a redesign that cut related incidents by 85 % over 6 months.
Establishing performance benchmarks & reporting	Created a KPI dashboard (uptime, MTBF, energy efficiency) in Excel/Power BI that reduced reporting time from 3 days to 4 hours and highlighted a 3 % energy‑use reduction opportunity.
Supporting rack installation with power & cooling provisioning	Coordinated 200+ rack deployments, verifying power‑draw and cooling capacity, achieving 99.999 % rack‑level availability during the rollout.
Implementing preventative & corrective maintenance programs	Designed a preventive‑maintenance schedule for chillers and cooling towers, increasing equipment MTBF by 22 % and cutting emergency service calls by 40 %.
Developing metrics & dashboards for facility performance	Built automated data‑center health dashboards (temperature, humidity, power quality) that alerted teams within 30 seconds of threshold breaches, reducing mean‑time‑to‑detect by 70 %.
Ensuring safety & environmental compliance	Conducted quarterly safety audits and fire‑system tests, achieving zero safety violations for 24 consecutive months.
Leading continuous‑improvement initiatives	Piloted a “green‑rack” program that reclaimed waste heat for on‑site office heating, delivering $150K annual energy savings and supporting AWS sustainability goals.

How to use: Pick 4‑6 bullets that best reflect your experience. Quantify whenever possible (time saved, cost avoided, % improvement, MW capacity, number of incidents, etc.). Keep each bullet ≤ 2 lines.

4️⃣ Cover‑Letter Draft (≈ 350 words)

[Your Name]
[Address] • [Phone] • [Email] • [LinkedIn]
6 April 2026

Hiring Manager
Amazon Web Services – Data Center Engineering Operations
[Location]

Dear Hiring Manager,

I am excited to apply for the Data Center Engineering Operations Engineer role at AWS. With a B.S. in Electrical Engineering, 7 years of hands‑on experience in large‑scale data‑center operations, and a proven track record of delivering 99.999 % uptime while driving cost‑saving initiatives, I am confident I can help AWS keep its global infrastructure running flawlessly.

In my most recent position at [Current/Previous Company], I served as the technical subject‑matter expert for a 2 MW campus that housed over 1,500 racks. I led the troubleshooting of generators, UPS systems, PDUs, chillers, and building‑automation controls, consistently meeting or beating SLAs, most notably restoring power after a generator trip in 12 minutes, well under the 30‑minute target. My root‑cause analyses have eliminated recurring failures, reducing related incidents by 85 % and saving the organization more than $250K in potential downtime.

I have extensive experience managing contractors and vendors, ensuring 100 % compliance with OSHA, NFPA, and local regulations during a $12M expansion project. By establishing performance benchmarks and building automated dashboards in Excel/Power BI, I cut reporting time from three days to four hours and identified a 3 % energy‑efficiency improvement that was subsequently rolled out across the site.

My background also includes construction and project‑management responsibilities—overseeing rack installations, power‑and‑cooling provisioning, and de‑commissioning activities—while maintaining strict safety standards. I hold an Electrical Journeyman License and am pursuing the DC III operating engineering license, aligning perfectly with your preferred qualifications.

AWS’s commitment to innovation, safety, and sustainability resonates with my own professional values. I am eager to bring my blend of technical expertise, vendor‑management acumen, and continuous‑improvement mindset to the Data Center Engineering Operations team, helping AWS deliver the reliability its customers depend on.

Thank you for considering my application. I look forward to the opportunity to discuss how my experience can contribute to AWS’s mission of powering the world’s most demanding workloads.

Sincerely,
[Your Name]

Tips for polishing:

Replace bracketed placeholders with your actual data.
Add a sentence that references a recent AWS news item (e.g., a new region launch) to show you’re up‑to‑date.
Keep the tone professional yet enthusiastic—AWS values “bold ideas” and “curiosity.”

5️⃣ Interview‑Prep Cheat Sheet

Likely Question	STAR‑plus‑Scale Answer Framework
Tell me about a time you resolved a critical infrastructure failure.	Situation: Generator trip at 2 MW site, SLA 30 min. Task: Restore power, prevent impact on customers. Action: Ran RCA, coordinated with vendor, swapped to standby generator, performed load‑balance. Result: Power restored in 12 min, zero customer impact, saved $250 K. Scale: Impacted 1,500 racks, 99.999 % uptime maintained.
How do you manage multiple contractors while ensuring safety compliance?	Situation: $12M expansion with 15 contractors. Task: Keep work on schedule, meet OSHA/NFPA. Action: Created daily safety brief, audit checklist, centralized CMMS for punch‑list tracking. Result: 100 % compliance, zero recordable incidents, finished 2 weeks early.
Describe a continuous‑improvement project you led.	Situation: High PDU overload alerts. Task: Reduce incidents. Action: Analyzed load data, re‑rated circuits, implemented automated alerts. Result: 85 % drop in overloads, saved $150K in overtime. Scale: Affected 200 racks, improved overall facility reliability.
How do you prioritize preventive maintenance vs. urgent repairs?	Situation: Limited crew, both PM schedule and emergency UPS fault. Task: Balance resources. Action: Used risk‑based scoring (MTBF, criticality), deferred low‑risk PM, allocated crew to UPS, completed PM after. Result: Critical issue fixed within SLA, PM backlog < 5 %, no SLA breach.
Give an example of how you used data/metrics to drive decisions.	Situation: Energy‑usage variance across zones. Task: Identify inefficiencies. Action: Built a Power BI dashboard pulling BAS data, highlighted a zone with 3 % over‑consumption. Result: Adjusted cooling set‑points, saved $200K annually, reduced carbon footprint.
Why AWS and why this role?	Situation: Looking for scale & impact. Task: Join a team that powers global innovation. Action: Highlight AWS’s 99.999 % uptime goal, your alignment with safety, sustainability, and continuous improvement. Result: Show enthusiasm for contributing to world‑class infrastructure.
Technical depth – Explain how a UPS works and how you’d troubleshoot a fault.	Brief Theory: A double‑conversion (online) UPS rectifies AC to DC to charge the batteries and feed the inverter, which converts DC back to AC; the batteries provide ride‑through during an input power loss. Troubleshoot Steps: 1) Verify input voltage, 2) Check battery health via BMS, 3) Inspect inverter output, 4) Review event logs, 5) Isolate the fault (e.g., transfer to bypass and isolate the failed module). Result: Provide example where you applied this flow.
Behavioral – Describe a time you disagreed with a senior leader and how you handled it.	Situation: Leader wanted to defer a safety upgrade. Task: Convince to prioritize. Action: Gathered data on risk, presented cost‑benefit analysis, proposed phased rollout. Result: Upgrade approved, avoided potential incident, maintained trust.
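The risk‑based scoring mentioned in the preventive‑maintenance answer above can be made concrete with a small sketch. Everything here is an assumption for illustration: the `WorkItem` fields, the 1 to 5 criticality scale, the scoring formula, and the sample backlog are hypothetical and not taken from the posting or from any standard.

```python
from dataclasses import dataclass

@dataclass
class WorkItem:
    name: str
    criticality: int      # hypothetical scale: 1 (low) to 5 (loss of critical load)
    mtbf_hours: float     # mean time between failures for the asset
    is_emergency: bool = False

def risk_score(item):
    # Hypothetical formula: more critical and less reliable assets score higher.
    base = item.criticality * (1000.0 / item.mtbf_hours)
    # Active emergencies are weighted up so they jump the queue.
    return base * 10 if item.is_emergency else base

def prioritize(items):
    """Return work items ordered from highest to lowest risk score."""
    return sorted(items, key=risk_score, reverse=True)

# Hypothetical backlog for a crew that cannot do everything at once.
backlog = [
    WorkItem("Air handler filter PM", criticality=2, mtbf_hours=20000),
    WorkItem("UPS module fault", criticality=5, mtbf_hours=50000, is_emergency=True),
    WorkItem("Chiller quarterly PM", criticality=4, mtbf_hours=10000),
]

for item in prioritize(backlog):
    print(f"{risk_score(item):5.2f}  {item.name}")
```

In an interview, the exact formula matters less than showing that deferral decisions follow a stated, repeatable rule rather than gut feel.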

Quick tip: For every answer, quantify the impact (cost saved, downtime avoided, % improvement) and tie it back to AWS’s 99.999 % uptime and cost‑efficiency goals.
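When you cite an availability figure, it helps to know what it implies in minutes. The sketch below converts an availability percentage into an annual downtime budget; the function name `downtime_budget_minutes` is an assumption, and the calculation uses a 365‑day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; ignores leap years

def downtime_budget_minutes(availability_pct):
    """Allowed downtime per year, in minutes, for an availability target."""
    return MINUTES_PER_YEAR * (1 - availability_pct / 100)

for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.2f} min/year")
```

Five nines allows roughly 5.26 minutes of downtime per year, so if you quote a recovery time longer than that, be ready to explain whether the critical load actually lost power or stayed on UPS during the event.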

What to Do Next

Update your résumé using the bullet templates (keep it to 1‑2 pages).
Paste the cover‑letter into a Word doc, replace placeholders, and tailor the opening paragraph to a recent AWS news item.
Run a keyword check (e.g., paste the job posting and your résumé into a free ATS scanner) and aim for a match above 85 %.
Practice the STAR‑plus‑Scale framework for the 8 interview questions above—record yourself or do a mock with a friend.
Gather supporting artifacts (certificates, project photos, performance dashboards) in case the interviewers ask for evidence.

Good luck! 🎉 If you’d like a full‑length résumé rewrite, a deeper dive into any of the interview questions, or help polishing the cover letter, just let me know.

Requirements

Bachelor's degree in Electrical Engineering, Mechanical Engineering, or a related field
Experience with Microsoft Office including Outlook, Word, and Excel
5+ years of relevant work experience in a data center or other critical environment
5+ years of experience in data center engineering/operations, including work in data centers or other critical environments and vendor management

Responsibilities

Serving as technical subject matter expert for hands-on troubleshooting of critical infrastructure including generators, switchgear, UPS systems, PDUs, chillers, cooling towers, air handlers, and building automation systems
Managing contractors, sub-contractors, and vendors to ensure compliance with established practices, procedures, and local legislation
Conducting root cause analysis for operational issues and implementing corrective actions
Establishing performance benchmarks and preparing comprehensive reports on facility infrastructure operations
Collaborating with DCO managers, business leaders, and partners to coordinate projects, manage capacity, and optimize safety, performance, reliability, and sustainability
Supporting rack installation with proper power and cooling provisioning
Implementing preventative and corrective maintenance programs for critical infrastructure
Developing metrics and dashboards for facility performance monitoring
Ensuring compliance with safety standards and environmental regulations
Leading continuous improvement initiatives to achieve operational excellence and uptime targets

Skills

AWS
Excel
Microsoft Office
Outlook
Word

