SRE/DevOps Engineer – APAC (Fully Remote)
Discovered
About the role
Discover the Opportunity
Our client is building the future of automated trading. Their IT Operations team is looking for a Site Reliability Engineer / DevOps Engineer to own their infrastructure during critical market hours.
You'll play a key role in supporting a high-frequency trading (HFT) and crypto environment, using Python, AI-driven automation, and AWS to ensure the systems never blink.
Discover the Responsibilities
- Act as the primary responder for production incidents and HFT infrastructure issues during the APAC shift.
- Contribute to reliability practices, reducing MTTR and maintaining high trading uptime.
- Use Grafana and Prometheus to maintain visibility into system health and alerts.
- Improve workflows and reduce manual overhead through Python-based automation.
Discover the Requirements
- 4-6 years of experience in Platform Support or DevOps.
- Strong Linux internals (debugging, networking) and AWS.
- Familiarity with Kubernetes (K8s) fundamentals and Helm.
- Ability to read, debug, and trace Python code to resolve production issues.
- Experience writing alerting queries and building dashboards (Grafana/Prometheus).
- A strong incident-response and operations background.
- Degree from a top-tier university.
Discover the Desired
- GitOps workflows or Terraform.
- Experience with Kafka or message bus architectures.
- Query experience with ClickHouse.
- Prior experience working for a trading company.
Discover the Benefits
- Work fully remote from your home country with a global, multicultural team.
- Collaborate with top-tier engineers from the world's leading tech and trading firms.
- We regularly gather in unique global locations to bond and build.
- In HFT, your work is visible in the PnL immediately. Every millisecond you save matters.
- Excellent vacation and sick-day policy.
- Individual budget for L&D courses.
If you’re passionate about system reliability, obsessed with minimizing MTTR, and enjoy collaborating in a high-velocity environment, you’ll fit right in on this team.
Apply Now!