Site Reliability Engineer/ServiceNow SaaS
NTT DATA North America
About the role
Below is a quick‑start kit you can use to apply for the Site Reliability Engineer / ServiceNow SaaS position at NTT DATA in Montreal.
It includes:
- A one‑page résumé template (with suggested wording that highlights the exact experience NTT DATA is looking for).
- A tailored cover‑letter draft you can copy‑paste and then personalize with your own details.
- A short “application checklist” to make sure you meet the posting’s mandatory items (e.g., office‑attendance requirement, on‑call rotation, etc.).
- Interview‑prep focus areas – the topics you’ll most likely be quizzed on and a few “STAR” story ideas you can flesh out.
Feel free to edit any section to match your own background and voice.
1️⃣ Résumé – One‑Page (PDF‑ready)
[Your Name]
Montreal, QC | [Phone] | [Email] | [LinkedIn] | [GitHub]
PROFESSIONAL SUMMARY
Seasoned Site Reliability Engineer with 7+ years of end‑to‑end ServiceNow administration, development, and SaaS operations, plus 7+ years of software‑engineering experience (Python, Bash, JavaScript). Proven track record of automating operational tasks, building observability pipelines, and reducing MTTR for mission‑critical enterprise services. Strong communicator who thrives in global, on‑call rotations and enjoys mentoring cross‑functional teams.
CORE COMPETENCIES
- ServiceNow Platform (Admin, Scripting, IntegrationHub)
- Python, Bash, PowerShell, JavaScript/Node.js
- Linux/Unix system administration (RHEL, Ubuntu)
- CI/CD (GitLab, Jenkins, Azure DevOps) & IaC (Terraform, Ansible)
- Observability: Prometheus, Grafana, ELK, Splunk, OpenTelemetry
- Incident Management & On‑Call (PagerDuty, Opsgenie)
- Automation & SRE tooling (Ansible, Terraform, CloudWatch, ServiceNow Orchestration)
- Documentation & Knowledge‑Base creation (Confluence, Markdown)
PROFESSIONAL EXPERIENCE
Senior Site Reliability Engineer – ServiceNow SaaS
XYZ Corp., Montreal, QC — Jan 2020 – Present
- Owned a portfolio of 12 production ServiceNow instances (ITSM, CSM, HRSD) serving > 30,000 users across North America.
- Reduced mean‑time‑to‑recover (MTTR) by 38 % through automated incident‑response playbooks built with ServiceNow Orchestration + Python scripts.
- Implemented observability stack (Prometheus + Grafana dashboards + Loki logging) that surfaced latency spikes and enabled SLO‑driven alerting for all critical services.
- Led on‑call rotation for a 24 × 7 global team (4‑hour hand‑off model) and introduced a “time‑off‑in‑lieu” policy that improved engineer satisfaction scores by 22 %.
- Automated 150+ routine admin tasks (user provisioning, ACL clean‑up, data archiving) using ServiceNow Flow Designer + Python APIs, saving ~ 1,200 hrs/yr.
- Mentored 5 junior SREs on best practices for IaC, monitoring, and incident post‑mortems; authored a living “SRE Playbook” now used company‑wide.
ServiceNow Developer / DevOps Engineer
ABC Solutions, Toronto, ON — Jun 2015 – Dec 2019
- Designed and delivered custom ServiceNow applications (catalog items, workflow automations) that increased self‑service adoption by 45 %.
- Built CI/CD pipelines (GitLab → ServiceNow instance) that reduced release cycle from bi‑weekly to daily.
- Integrated ServiceNow with Azure AD, Jira, and Splunk via REST & SOAP APIs, enabling end‑to‑end ticket traceability.
- Developed Python‑based health‑check agents for on‑prem Linux servers, feeding metrics into ServiceNow Event Management.
Software Engineer
TechStart Ltd., Montreal, QC — Jan 2012 – May 2015
- Developed backend services in Python (Flask) and Bash automation scripts for internal tooling.
- Participated in full‑stack development (HTML5/CSS3/JS) for internal dashboards.
EDUCATION
B.Sc. Computer Science – Université de Montréal (2011)
CERTIFICATIONS
- ServiceNow Certified System Administrator (CSA) – 2022
- ServiceNow Certified Application Developer (CAD) – 2023
- Certified Kubernetes Administrator (CKA) – 2021 (optional, but shows cloud‑native chops)
TECHNICAL TOOLBOX (selected)
| Language | Tools / Platforms | Monitoring / Observability |
|---|---|---|
| Python (3.x) | Git, Docker, Terraform, Ansible, ServiceNow Studio | Prometheus, Grafana, ELK, Splunk, OpenTelemetry |
| Bash / PowerShell | Linux (RHEL, Ubuntu), Windows Server | CloudWatch, ServiceNow Event Management |
| JavaScript (Node.js) | ServiceNow Flow Designer, IntegrationHub | PagerDuty, Opsgenie |
2️⃣ Cover‑Letter (Tailor‑Ready)
[Your Name]
Montreal, QC | [Phone] | [Email] | [Date]
Hiring Manager
NTT DATA North America
Montreal, QC
Dear Hiring Manager,
I am excited to submit my application for the Site Reliability Engineer / ServiceNow SaaS role at NTT DATA. With over seven years of hands‑on ServiceNow administration and development, combined with a solid background in Python‑driven automation, Linux operations, and observability engineering, I am confident I can help NTT DATA deliver the high‑availability, performance‑focused services your clients expect.
At XYZ Corp., I currently own a portfolio of twelve production ServiceNow instances that support more than 30,000 users. By building automated remediation playbooks (Python + ServiceNow Orchestration) and a unified metrics‑alerting stack (Prometheus + Grafana + Loki), I reduced MTTR by 38 % and enabled SLO‑driven alerting across the board. My experience aligns directly with the posting’s focus on “maximizing availability and performance through optimized and automated operational tasks.”
Key achievements that map to the NTT DATA requirements:
| NTT DATA Requirement | My Relevant Experience |
|---|---|
| 7+ years ServiceNow administration | 8 years managing multiple enterprise instances, including custom app development and IntegrationHub flows. |
| 7+ years software development (Python) | Built > 150 automation scripts and CI/CD pipelines; authored Python health‑check agents for Linux servers. |
| Troubleshooting on‑prem Linux environments | Daily use of Bash/PowerShell for server provisioning, log analysis, and incident response. |
| Observability (metrics, logging, tracing, alerting) | Designed end‑to‑end monitoring stack (Prometheus, Grafana, Loki, OpenTelemetry) and defined SLOs for all critical services. |
| On‑call rotation & incident response | Lead a 24 × 7 global on‑call rotation; introduced a “time‑off‑in‑lieu” policy that improved team morale. |
| Documentation & knowledge sharing | Authored a living SRE Playbook and maintained Confluence knowledge base used by > 50 engineers. |
| Office presence (≥ 3 days/week) | Currently based in Montreal and work from the office 4 days per week. |
Beyond technical expertise, I pride myself on clear, collaborative communication—whether drafting post‑mortem reports for senior leadership or coaching junior engineers on best‑practice SRE patterns. I am eager to bring that same energy to NTT DATA’s inclusive, forward‑thinking culture.
Thank you for considering my application. I look forward to the opportunity to discuss how my background, skills, and passion for reliable SaaS platforms can contribute to NTT DATA’s continued success.
Sincerely,
[Your Name]
3️⃣ Application Checklist (Before You Hit “Submit”)
| Item | Done? | Notes |
|---|---|---|
| ✅ Updated résumé (PDF, 1‑page) – includes ServiceNow, Python, Linux, observability keywords. | ||
| ✅ Tailored cover letter (addressed to NTT DATA, includes Montreal location). | ||
| ✅ Highlighted on‑call rotation experience and office‑attendance commitment. | ||
| ✅ Added ServiceNow certifications (CSA, CAD) – attach copies if the portal allows. | ||
| ✅ Verified that your LinkedIn/GitHub showcase relevant ServiceNow scripts or open‑source Python tools. | ||
| ✅ Prepared a short “elevator pitch” (30‑sec) for phone screens: “I’m a Montreal‑based SRE with 7 years of ServiceNow admin + Python automation, known for cutting MTTR by 38 % through observability pipelines…” | ||
| ✅ Confirmed you can work ≥ 3 days per week in the Montreal office (or have a plan to discuss hybrid flexibility). | ||
| ✅ Reviewed the job posting for any required legal eligibility (work permit, etc.) and uploaded proof if needed. | ||
| ✅ Saved a copy of the job ID (J‑18808‑Ljbffr) for future reference. | ||
4️⃣ Interview‑Prep – What They’ll Likely Ask & Sample STAR Stories
| Topic | Possible Question | Suggested STAR Story |
|---|---|---|
| ServiceNow Administration | “Walk me through a complex workflow you built in ServiceNow and the business impact.” | Situation: Legacy ticket routing caused 2‑day delays. Task: Redesign workflow to auto‑assign based on CI‑type. Action: Used Flow Designer + Script Includes (server‑side JavaScript) to map CI attributes to assignment groups. Result: Reduced average resolution time by 45 % and saved ~ 800 hrs/yr. |
| Python Automation | “Give an example of a Python script you wrote to automate a repetitive SRE task.” | Situation: Manual user‑provisioning for 200+ new hires each month. Task: Automate provisioning in ServiceNow and Azure AD. Action: Developed a Python CLI using ServiceNow REST API + Azure Graph API, integrated with Jenkins. Result: Cut provisioning time from 2 days to < 30 minutes; zero provisioning errors for 12 months. |
| Observability & SLOs | “How do you define and monitor SLOs for a SaaS product?” | Situation: No clear reliability targets for a ServiceNow‑based portal. Task: Establish SLOs for availability & latency. Action: Implemented Prometheus exporters on ServiceNow instances, created Grafana dashboards, set alert thresholds in Alertmanager. Result: Met the 99.9 % availability SLO for 6 months; early detection of latency spikes reduced incident duration by 30 %. |
| On‑Call & Incident Management | “Describe a high‑severity outage you handled. What was your role?” | Situation: ServiceNow Incident Management module went down for a major client. Task: Lead incident response, restore service, communicate status. Action: Triggered PagerDuty, executed automated rollback playbook (Ansible + ServiceNow Orchestration), coordinated with DB admin, kept stakeholders updated via Slack channel. Result: Service restored in 42 minutes (vs. 2 hrs historically); post‑mortem identified a config drift that was patched. |
| Collaboration & Documentation | “How do you ensure knowledge transfer across a distributed SRE team?” | Situation: New engineers joining a global SRE team. Task: Reduce onboarding time. Action: Created a Confluence “SRE Handbook” with runbooks, recorded video demos of common tasks, instituted weekly “Lunch‑and‑Learn” sessions. Result: Onboarding time dropped from 4 weeks to 1 week; team reported higher confidence in handling incidents. |
| Technical Debt Prioritization | “Give an example of how you identified and prioritized technical debt.” | Situation: Legacy custom scripts caused frequent failures. Task: Prioritize refactor vs. new feature work. Action: Ran static analysis, logged failure frequency, calculated ROI; presented a roadmap to leadership. Result: Secured budget to rewrite 3 critical scripts, reducing related incidents by 70 %. |
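If the conversation turns to SLOs and error budgets, it helps to have the arithmetic at your fingertips. The sketch below is a minimal illustration in Python: it reuses the 99.9 % availability target and the 42‑minute outage from the sample stories above, and assumes a 30‑day measurement window. The window length, the function name error_budget, and the report fields are assumptions made for illustration, not anything stated in the posting.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BudgetReport:
    budget_minutes: float      # total downtime the SLO allows in the window
    remaining_minutes: float   # budget left after observed downtime
    consumed_fraction: float   # share of the budget already spent


def error_budget(slo_target: float, window_days: int, downtime_minutes: float) -> BudgetReport:
    """Compute an availability error budget for a fixed measurement window."""
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1, e.g. 0.999")
    window_minutes = window_days * 24 * 60
    budget = (1 - slo_target) * window_minutes
    return BudgetReport(
        budget_minutes=budget,
        remaining_minutes=budget - downtime_minutes,
        consumed_fraction=downtime_minutes / budget,
    )


# Hypothetical inputs: 99.9 % target, 30-day window (assumed), one 42-minute outage.
report = error_budget(0.999, 30, 42)
print(
    f"budget={report.budget_minutes:.1f} min, "
    f"remaining={report.remaining_minutes:.1f} min, "
    f"consumed={report.consumed_fraction:.0%}"
)
# prints: budget=43.2 min, remaining=1.2 min, consumed=97%
```

Under these assumptions, a single 42‑minute outage uses almost the entire monthly budget, which is a concrete way to explain why you would pause risky releases after an incident of that size.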
Tips for the interview:
- Speak the NTT DATA language – they emphasize “inclusive, adaptable, forward‑thinking.” Mention how you foster collaboration and mentor teammates.
- Quantify impact – use percentages, time saved, MTTR reduction, SLA improvements.
- Show SRE mindset – talk about reliability budgets, error‑budget policies, and how you balance feature velocity with stability.
- Be ready for a short live‑coding or scripting exercise – they may ask you to write a small Python function that interacts with a mock ServiceNow API. Practice using requests and handling pagination and errors.
- Ask thoughtful questions – e.g., “How does NTT DATA currently measure reliability for its ServiceNow SaaS offerings?” or “What are the biggest upcoming challenges for the Montreal SRE team?”
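For the live‑coding tip above, here is one possible practice sketch in Python. It assumes a Table‑API‑style endpoint (/api/now/table/<table>) that accepts sysparm_limit and sysparm_offset and returns rows under a "result" key. The function name iter_records, the retry settings, and the set of retryable status codes are illustrative choices, not anything the posting specifies. The session object is passed in so that you can supply requests.Session() against a mock instance, or a hand‑written fake when practising offline.

```python
import time
from typing import Any, Callable, Dict, Iterator

# Status codes treated as transient and worth retrying (an assumption for this sketch).
RETRYABLE = {429, 500, 502, 503, 504}


def iter_records(
    session: Any,                 # anything with a requests-style .get(url, params=, timeout=)
    base_url: str,
    table: str,
    page_size: int = 100,
    max_retries: int = 3,
    backoff_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Iterator[Dict[str, Any]]:
    """Yield every record from a paginated table endpoint, one page at a time."""
    url = f"{base_url}/api/now/table/{table}"
    offset = 0
    while True:
        params = {"sysparm_limit": page_size, "sysparm_offset": offset}
        for attempt in range(max_retries + 1):
            resp = session.get(url, params=params, timeout=10)
            if resp.status_code not in RETRYABLE:
                break
            if attempt < max_retries:
                sleep(backoff_s * 2 ** attempt)  # exponential backoff between retries
        resp.raise_for_status()  # surface non-retryable errors or exhausted retries
        rows = resp.json().get("result", [])
        yield from rows
        if len(rows) < page_size:  # a short or empty page means there is nothing left
            return
        offset += page_size
```

With the requests library installed, you could call it as list(iter_records(requests.Session(), base_url, "incident")), where base_url points at whatever mock host you are given. Writing it as a generator keeps memory flat on large tables, and injecting sleep makes the retry path easy to unit‑test, both of which are worth saying out loud during the exercise.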
Quick Copy‑Paste for the Application Portal
Résumé File: YourName_NTTDATA_SRE_Montreal.pdf
Cover Letter File: YourName_CoverLetter_NTTDATA.pdf
When the portal asks for “Additional Information,” you can paste:
Job ID: J‑18808‑Ljbffr
Location Preference: Montreal (able to work on‑site ≥ 3 days/week)
Eligibility: Canadian citizen (or permanent resident) – authorized to work in Canada.
Final Thought
You already have the core experience NTT DATA is looking for; the key now is presentation. Use the résumé template above to make those 7+ years of ServiceNow and Python work stand out, and pair it with the cover letter that directly mirrors the language of the posting. Good luck – you’re a strong fit, and with this polished package you’ll make a compelling case for the role! 🚀
Requirements
- 7+ years of software development experience in one or more programming languages, e.g. Python
- 7+ years of experience in ServiceNow administration
- 7+ years of development experience
- Proficient oral and written communication skills.
- Establishing warm, effective relationships with colleagues to collaborate on successful delivery.
- A dependable team worker with demonstrated commitment to client services.
- Ability to respond appropriately during occasional technical emergencies, like outages.
Responsibilities
- Delivery of improvements that will maximize the availability and performance of supported systems through optimized and automated operational tasks, collaborating on the development of operational tools, ongoing problem management, and architecture reviews with colleagues.
- Troubleshooting ServiceNow issues and, from time to time, some on‑premises capabilities in a Linux environment, collaborating with others to get to the bottom of issues and agreeing on lasting improvements that can be made.
- Exploring and delivering observability including metrics, logging, tracing and alerting that can define and measure the target reliability of a product.
- Being dependable and responsive during agreed hours, like when part of the on‑call rotation with the rest of the global team (with a time‑off in lieu system).
- A commitment to understanding the company's ServiceNow instances and related dependencies, and contributing to the documentation.
- Identification and prioritization of technical debt that can impact client satisfaction or operational efficiency.
- Giving feedback on policies and procedures related to the delivery of SRE and operational practices, with a view to continually making the company safer and more efficient.