Junior Site Reliability Engineer

Signa Opportunity

South Africa · On-site · Contract · Entry Level · 1w ago

About the role

This role contributes to the reliability, scalability, and performance of the company’s cloud-native infrastructure and production services.

Responsibilities

System Monitoring & Observability
• Configure and maintain monitoring tools (e.g., Prometheus, Datadog) to track key system metrics (latency, traffic, errors, saturation).
• Create and refine dashboards and alerts to ensure rapid detection of anomalies and potential outages.
• Assist in the implementation of distributed tracing and structured logging to improve debugging and performance analysis.

Incident Response & Management
• Participate in a 24/7 on-call rotation as a secondary responder, escalating issues as needed to senior team members.
• Follow incident response playbooks to diagnose and mitigate production incidents, aiming to restore service within defined SLOs.
• Contribute to blameless post-incident reviews by documenting timelines, root causes, and action items to prevent recurrence.

Automation & Infrastructure as Code
• Develop and maintain automation scripts (Python, Go, or Bash) to streamline repetitive operational tasks such as certificate rotation, user access management, and log rotation.
• Assist in managing cloud infrastructure using IaC tools (Terraform, CloudFormation) to ensure consistent, version-controlled, and repeatable deployments.
• Support CI/CD pipeline improvements (GitLab CI, GitHub Actions, Jenkins) to enable safe and efficient application deployments.

Capacity Planning & Performance Tuning
• Collect and analyse resource usage trends (CPU, memory, storage, network) to help forecast capacity needs and recommend scaling actions.
• Work with development teams to conduct load testing and identify performance bottlenecks.

Collaboration & Knowledge Sharing
• Partner with software engineers to implement service level indicators (SLIs) and define realistic service level objectives (SLOs).
• Document system architecture, operational runbooks, and common troubleshooting steps to empower the wider team.
• Actively participate in team agile ceremonies, providing input on reliability risks for upcoming features.

Beneficial Skills (Desired Skills):
• Container Orchestration: Hands-on experience with Kubernetes (cluster administration, Helm charts, pod autoscaling) or Docker Swarm.
• Programming & Scripting: Proficiency in at least one high-level language (Python, Go) for automation and tooling; comfort with shell scripting.
• CI/CD Pipelines: Familiarity with building and maintaining deployment pipelines, including canary deployments, feature flags, and rollback strategies.
• Observability Stack: Experience with Prometheus, Grafana, Loki, Tempo, or the ELK Stack (Elasticsearch, Logstash, Kibana).
• Networking: Working knowledge of load balancers (NGINX, HAProxy), DNS management, firewalls, and TCP/IP troubleshooting.
• Cloud Platforms: Exposure to AWS (EC2, EKS, RDS, S3) or equivalent cloud provider services.
• Security Best Practices: Understanding of identity and access management (IAM), secrets management (Vault, AWS Secrets Manager), and basic security hardening.

Minimum Requirements
• Unemployed South African youth between the ages of 18 and 34.
• Must not have previously participated in the YES programme.
• Matric.
• Bachelor’s degree in Computer Science, Information Technology, Engineering, or a related field; or equivalent demonstrable experience in a systems or software engineering role.

Certifications & Licenses (Desired but not required):
• AWS Certified Solutions Architect – Associate or equivalent cloud certification.
• Certified Kubernetes Administrator (CKA) or CKAD.
• Any SRE-related or DevOps training certifications.

Technical Fundamentals:
• Solid understanding of Linux/Unix operating systems (systems, filesystems, process management, networking stack).
• Familiarity with at least one cloud provider (AWS, GCP, or Azure) and its core compute, storage, and networking services.
• Basic understanding of version control systems (Git) and collaborative development workflows (pull requests, code reviews).

Personal Characteristics:
• Problem-solving mindset: Able to break down complex issues into manageable components and systematically debug under pressure.
• Ownership: Takes responsibility for tasks from inception to completion, with a focus on quality and reliability.
• Curiosity: Passionate about learnin
