Experienced Software Engineer

fal

On-site · Full-time · Senior · Yesterday

About the role

You are an experienced software engineer who thrives on building large-scale computing platforms. You have deep expertise in large-scale distributed systems that handle high complexity, heavy traffic, and large volumes of data. You know how to achieve reliability and scale with minimal operational load.

Key Responsibilities

  • Build our core Python/Rust platform: request routing, AI workload orchestration, scheduling, GPU autoscaling, large-scale file storage, queueing, etc.
  • Produce forward-looking designs for the platform's evolution as we scale to 100x current traffic and need to provide low latency worldwide
  • Use AI extensively to automate the mundane parts of building complex but reliable systems
  • Profile and tune low-level CPU and memory performance

Requirements

  • 5+ years of experience building distributed compute and orchestration platforms in Python or Rust
  • Strong understanding of distributed systems fundamentals: consensus, scheduling, fault tolerance, capacity planning
  • Deep understanding of computational complexity and memory allocation
  • Track record of designing systems that scale under real production load
  • Experience building and using observability to drive performance and reliability decisions
  • Excellent communication and ability to drive technical decisions across teams
  • Self-starter who executes quickly, takes ownership, and constantly seeks improvement

Nice to have

  • Experience with AI/ML inference or training infrastructure
  • Experience with high-performance systems programming (async runtimes, zero-copy, memory-safe concurrency)
  • Background in building multi-tenant compute platforms
  • Understanding of networking fundamentals and performance characteristics
  • Familiarity with GPU workload characteristics and scheduling constraints

Location

Turkey

What we offer at fal

  • Interesting and challenging work
  • Plenty of learning and growth opportunities
  • Regular team events and offsites

Skills

AI, AWS Lambda, async runtimes, capacity planning, concurrency, CPU, Docker, fault tolerance, GPU, Kubernetes, low-level systems programming, memory allocation, multi-tenancy, networking, observability, orchestration, PostgreSQL, Python, queueing, Rust, scheduling, zero-copy
