Platform / Site Reliability Engineer (SRE)
H&R Block
Kitchener · On-site · Full-time · Senior · 4d ago
About the role
Our client is transforming industries through cutting‑edge technology. Their platform leverages AI, automation, and scalable systems to solve complex real‑world problems. As a Platform / Site Reliability Engineer (SRE), you will play a key role in establishing and enhancing the engineering platform, ensuring reliability, scalability, and efficiency while developing tools that improve engineering productivity. You will help define and shape the platform strategy, set best practices, and drive initiatives that enhance developer experience, system performance, and operational efficiency.
Responsibilities
- DevOps & Infrastructure: Design, implement, and maintain scalable infrastructure to support engineering needs.
- CI/CD Optimization: Improve continuous integration and deployment pipelines using AWS CDK, including defining requirements for deployment and database migration tooling.
- Release Tracking & Deployment: Establish visibility into release cycles, implement automation to streamline deployments, and ensure smooth rollouts.
- Site Reliability & Observability: Implement monitoring, logging, and alerting systems to ensure high availability and performance.
- Internal Tooling: Build and maintain tools that improve developer efficiency, automate repetitive tasks, and enhance productivity.
- Security & Compliance: Ensure infrastructure and deployments align with security best practices, with attention to SOC, ISO, and GDPR requirements.
Requirements
- 7+ years of technical experience, including 5+ years as an SRE or in a similar role.
- Startup experience is a plus.
- Deep expertise in AWS, including container orchestration with Fargate and Kubernetes.
- Strong experience with CI/CD pipelines, particularly using AWS CDK.
- Proficiency with observability tools (Datadog, Prometheus, Grafana).
- Strong knowledge of scaling strategies and highly available architectures.
- Proficiency in scripting/automation with Python, Bash, or TypeScript.
- Familiarity with security best practices and compliance frameworks (SOC, ISO, GDPR).
- Strong collaboration skills and ability to work cross‑functionally.
Tech Stack
- Infrastructure: AWS, Fargate, Redis, PostgreSQL, SQS, CDK, GitHub, Retool
- Backend: Django REST framework, Celery
- Frontend: Next.js, Tailwind CSS
- LLM Integrations: OpenAI, Claude, AWS Bedrock
Job ID: #J-18808-Ljbffr
Skills
AWS, AWS Bedrock, Bash, CDK, Celery, CI/CD, Datadog, Django REST framework, Fargate, Grafana, GitHub, ISO, Kubernetes, Next.js, OpenAI, PostgreSQL, Prometheus, Python, Retool, SQS, SOC, Tailwind CSS, TypeScript