
Staff Software Engineer, Machine Learning

Crunchyroll

Los Angeles · On-site · Full-time · Lead · $200k–$249k/yr

About the role

Position Overview

Crunchyroll, founded by fans, is the world’s largest destination for anime and manga, serving over 100 million fans across 200+ countries. As part of the Platform Development organization, we are seeking a Staff Software Engineer in Los Angeles to drive the design and evolution of core platform services—including authentication, security, notifications, and ML inference runtimes. You will work closely with ML, data science, and engineering teams, reporting to the Engineering Manager, Platform, to ensure reliability, scalability, and performance for our global audience.

Key Responsibilities

  • Architect, build, and maintain ML inference runtimes for multi-model serving, autoscaling, and GPU/TPU utilization.
  • Optimize inference pipelines and platform services for performance, reliability, and scalability.
  • Lead deployment, operationalization, and maintenance of ML workloads in collaboration with ML and data science teams.
  • Shape and maintain core platform services including authentication, security, and notifications.
  • Ensure seamless integration with platform infrastructure, CI/CD pipelines, and observability systems.
  • Define scalable system architectures and guide cross-team design alignment.
  • Develop benchmarking, validation, and monitoring tools to measure and maintain system performance.
  • Promote security, compliance, and engineering best practices across platform and ML services.
  • Mentor and influence engineering peers, fostering technical excellence and consistent standards.

Required Qualifications

  • 12+ years of backend software engineering experience, with a proven track record leading complex projects end-to-end.
  • Hands‑on experience building and optimizing AI/ML inference runtimes (e.g., KServe, TorchServe, TensorRT, Triton) and integrating with CI/CD and MLOps pipelines (e.g., SageMaker, Kubeflow, BentoML).
  • Expertise in JavaScript/TypeScript with additional experience in Golang or Kotlin.
  • Experience with containers, orchestration (Kubernetes/ECS), cloud platforms (AWS preferred), and distributed systems.
  • Proficiency in performance profiling, model optimization, and designing inference workloads to meet latency/throughput SLAs.
  • Experienced in building scalable APIs (REST/gRPC), caching strategies, and high‑performance systems including relational and NoSQL databases.
  • Familiarity with monitoring, observability tools, and security/compliance best practices in production ML/AI services.
  • Strong communication skills, problem‑solving ability, and commitment to engineering best practices.
  • Bachelor’s degree in Computer Science, Engineering, or a related field — or equivalent practical experience.

Benefits & Perks

  • Compensation: Competitive package including base salary plus performance-bonus earning potential.
  • Pay Range: $200,000–$249,000 USD (actual pay may vary based on experience, location, and performance).
  • Flexible time off policies to help you be your whole self.
  • Generous medical, dental, vision, STD, LTD, and life insurance coverage.
  • Health Savings Account (HSA) program and dependent care FSA.
  • 401(k) plan with employer match and employer-paid commuter benefits.
  • Support program for new parents and pet insurance with pet‑friendly office options.

Skills

AWS · BentoML · ECS · gRPC · Golang · JavaScript · KServe · Kubernetes · Kotlin · REST · SageMaker · TensorRT · TorchServe · Triton · TypeScript
