
Platform Site Reliability Specialist

Tandym Group

New York · Hybrid · Full-time · Mid Level · 1mo ago

About the role

A recognized media services company is seeking a Platform Site Reliability Specialist to join its growing team. The role focuses on automating the monitoring and observability solutions that keep infrastructure reliable.

This is a hybrid position that requires working onsite at least 3 days a week.

Responsibilities

  • Implementing and maintaining automated workflows to streamline manual processes.
  • Building comprehensive dashboards for real-time system visibility.
  • Developing proactive alerting systems to identify performance bottlenecks.
  • Standardizing monitoring protocols for consistent service reliability.
  • Providing rapid response and resolution during system disruptions.
  • Performing other duties as needed.

Qualifications

  • 3+ years of experience in Technical Operations and Automation roles
  • High School Diploma / GED
  • Proficiency with observability tools such as Datadog or Zabbix
  • Automation scripting capabilities
  • Strong analytical and problem-solving skills
  • Ability to participate in a 24/7 on-call rotation
  • Attention to detail
  • Effective communication skills

Desired Qualifications

  • Bachelor's Degree in a computer-related field
  • Experience in Media and Entertainment, particularly live-streaming or CDN management
  • Solid understanding of Agile frameworks

Skills

Datadog, Zabbix
