Staff Storage Reliability Engineer
1&1 IONOS SE
Hybrid, Full-time, Lead, Today
About the role
We are looking for a Staff Storage Reliability Engineer for our global, public object storage platform based on Ceph. You will develop solutions that scale in production and work with our operations team to build, deploy, and maintain the platform.
We currently store data in the double-digit petabyte range, distributed across multiple locations. Our Ceph platform is growing rapidly and is a critical component of our internal and public infrastructure. You will actively contribute to its further development, improve and maintain our production environments, and ensure that availability, performance, and security hold up as the platform scales.
Responsibilities
Ceph Deployment and Management:
- Deploy, configure, and operate Ceph clusters – including managing storage pools, placement groups, and other core components.
Performance Optimization:
- Tune Ceph for optimal performance, eliminate bottlenecks, and ensure efficient resource utilization.
Automation:
- Develop and implement automation strategies for Ceph deployments, upgrades, and maintenance tasks.
Troubleshooting and Problem Solving:
- Diagnose and resolve complex technical issues related to Ceph storage – often in collaboration with other teams.
Collaboration:
- Collaborate closely with development teams, system administrators, and other stakeholders to integrate Ceph into various systems and applications.
Staying Up-to-Date:
- Follow the latest Ceph developments, new features, and best practices.
- Actively participate in the Ceph community and share knowledge.
Qualifications
- 5+ years of experience as a Senior Linux Engineer or Site Reliability Engineer; deep and broad understanding of Linux systems and networks.
- In-depth knowledge of Ceph architecture and administration.
- Experience with cloud storage technologies (File, Object, Block).
- Hands-on experience with automation tools (e.g., Ansible) and monitoring and observability solutions.
- Familiarity with cloud platforms and container technologies (e.g., Docker).
- Excellent troubleshooting and problem-solving skills, strong communication and collaboration skills.
Benefits
- Hybrid work model.
- Flexible working hours with trust-based working time.
- Subsidized canteen and various free drinks at some locations.
- Modern office spaces with excellent public transport connections.
- Various employee discounts for activities and products.
- Employee events such as summer and winter parties, as well as workshops.
- Numerous further training and development opportunities.
- Various health offers, such as sports and health courses.
Skills
Ansible, Ceph, Docker, Linux, Object Storage