Senior Site Reliability Engineer
Brainhunter Systems Ltd
About the role
Senior Site Reliability Engineer (SRE)
We are seeking a Senior SRE Consultant to own the reliability and performance of our infrastructure across cloud and off/on-prem environments: designing resilient systems, automating deployments, and ensuring performance at scale. Hands-on experience with off/on-prem infrastructure is a must. We’re building next-gen sweepstakes gaming experiences that are fast, reliable, and highly scalable. As a Senior SRE, you’ll own the infrastructure that powers everything, primarily across on-prem and hybrid environments, ensuring our systems are resilient, performant, and built to scale.
Key Accountabilities:
- Design, build, and operate on-prem and hybrid infrastructure, with potential integration into cloud environments over time
- Architect and maintain highly available, resilient systems for real-time, high-traffic gaming workloads
- Automate deployments, infrastructure provisioning, and operational workflows (CI/CD, IaC where applicable)
- Monitor system performance, uptime, and reliability—proactively identifying and resolving issues
- Implement observability best practices (logging, metrics, tracing, alerting)
- Improve system resilience through redundancy, failover strategies, and disaster recovery planning
- Partner closely with backend and platform teams to optimize system performance and reliability
- Own incident response, postmortems, and continuous improvement of system stability
Qualifications and Skillset for this Role:
- Strong experience in Site Reliability Engineering, DevOps, or infrastructure engineering, with a focus on on-prem or hybrid environments
- Deep understanding of physical infrastructure, networking, and distributed systems
- Experience managing servers, virtualization, and data center environments
- Hands-on experience with automation, scripting, and deployment workflows
- Strong troubleshooting skills across systems, networking, and performance bottlenecks
- Experience with monitoring and observability tools (Prometheus, Grafana, Datadog, or similar)
- Solid understanding of security, redundancy, and system design for uptime and resilience
- Comfortable working autonomously in a fast-paced startup environment
- Exposure to cloud platforms (AWS, GCP, Azure) and hybrid infrastructure models
- Experience with containers and orchestration (Docker, Kubernetes)
- Familiarity with backend systems (Node.js / TypeScript environments)
- Experience supporting real-time or high-concurrency systems (gaming, fintech, etc.)
Why This Role:
- Own and shape the core infrastructure of a rapidly scaling platform
- High ownership: define how systems are built, deployed, and operated
- Work on real infrastructure challenges beyond just cloud abstractions
- Startup speed: ship fast and see impact immediately
- Fully remote, flexible environment
How to Apply:
Please email your up-to-date Resume/CV to
We thank all applicants for their interest in working with us; however, only candidates shortlisted for the next steps in the hiring process will be contacted.
Thank you, and have a wonderful day!