Site Reliability Engineer - Observability & Internal Tools
smartclip
About the role
Your role in the team
• Remote in our day-to-day work. On-site when it matters.
• We work remote by default - focused, efficient, and with full ownership. For larger features, architectural decisions, and real brainstorming sessions, we come together in Berlin or Cologne - fast, hands-on, and without unnecessary meeting overhead.
• We use AI to accelerate - not to replace thinking.
• We design the system, steer the output, and take responsibility for what we ship.
• Fast where it makes sense. Careful where it matters.
• Take full ownership of smartclip's internal utility and platform tooling.
• Focus your energy on the intersection of observability, automation, and developer infrastructure.
• Don't just maintain existing systems - evolve them, research cutting-edge open-source alternatives, and implement them.
• Forget expensive enterprise SaaS. Invest in deep in-house expertise.
• Understand our systems end-to-end, maintain total flexibility, and contribute back to the open-source ecosystem we depend on.
• Build & Evolve: Operate and advance our observability stack (including Prometheus, Grafana, and Forgejo).
• Go Open Source First: Replace 'buy' decisions with robust 'build & maintain' strategies.
• Engineer the Platform: Design observability as a platform capability. Define SLOs and create actionable alerting to stop incidents before they start.
• Secure the Stack: Embed security engineering into the delivery process. Find vulnerabilities before the pen tests do.
• Master the Infrastructure: Navigate Linux systems and distributed tooling. Balance bold exploration with production stability.
What we offer
• Ownership over tickets: You're trusted with real responsibility, not just tasks. No unnecessary bureaucracy, no micromanagement - we rely on you to take things forward.
• Build > Talk: We test what works - not what sounds good. Fail fast, learn faster.
• High standards, low ego: We take our work seriously, but not ourselves. Direct feedback, honest collaboration, no drama.
• Stay sharp: Hackathons, conferences, community - we invest in your growth and keep you at the cutting edge.
• Remote flexibility, in person when it matters: You work flexibly and remotely, with a connection to our Berlin or Cologne locations, home to our TV Labs, where we experiment, build, and learn together.
• And yes - the fundamentals are covered too: 30 days of vacation + Dec 24 & 31 off, Smart Fridays (4-day week possible), mobility (Germany ticket & JobRad), sports & health offerings, mental health support, corporate benefits, RTL+ access, and more.
Technologies and skills
• Google Cloud Platform
• Linux
• Prometheus
• Grafana
Our expectations:
Qualifications
• Be motivated by systems thinking and deep technical curiosity.
• Stop being a consumer - start being a builder.
Must-haves:
• Apply an Observability Mindset: Implement a clear strategy for metrics, logs, and traces. Transform 'noisy alerts' into 'actionable insights.'
• Embrace Ownership: Live the 'you build it, you run it' philosophy. Stop the ticket ping-pong and end the excuses.
Nice-to-haves:
• Design and evolve production-grade setups on GCP or AWS.
• Show us your contributions to open-source projects.
• Turn your passion for root-cause analysis into blameless post-mortems.
Benefits
• Fitness Offers
• Fresh Fruit
• Jobbike
• Coffee, Tea, etc.