Site Reliability Engineer Intern

Interintel

Kenya

Internship
Published March 04, 2026
Apply by March 09, 2024

Job Description

The intern will assist in designing, automating, and monitoring reliable, scalable platforms, and will support incident response, infrastructure as code, and observability using tools like Prometheus, Terraform, and Grafana.

Key Responsibilities

  • Assist in designing, implementing, and continuously improving system reliability, availability, and performance by defining and monitoring SLIs, SLOs, and error budgets across all assigned platforms.
  • Support building and managing a robust monitoring and observability framework using Prometheus, Grafana, and Loki to track latency, traffic, errors, system health, and user impact.
  • Assist in automating infrastructure provisioning, scaling, and configuration management using Infrastructure as Code principles with Terraform and Ansible to ensure consistency, scalability, and disaster recovery readiness.
  • Participate in incident response processes, including detection, escalation, resolution, communication, and conducting blameless postmortems to prevent recurrences.
  • Assist in reducing manual operational workload through automation, observability, and process optimization to improve efficiency and release velocity.
  • Help ensure high availability and performance of business-critical systems.
  • Collaborate with Engineering, Product, and DevOps teams to help improve deployment velocity, capacity planning, cost optimization, and system availability.
  • Assist in establishing alerting strategies and reliability standards that minimize alert fatigue while ensuring rapid detection and resolution of production issues.

Requirements

  • Bachelor's degree in Computer Science, Information Technology, or a related field.
  • Basic hands‑on exposure to monitoring and metrics systems such as Prometheus.
  • Some exposure to Kubernetes and cloud networking.
  • Some experience with monitoring and observability tools such as Datadog.
  • Good exposure to managing production systems in cloud environments.
  • Some exposure to implementing and managing CI/CD pipelines using tools such as Jenkins, GitLab CI/CD, or equivalent.
  • Basic familiarity with containerization tools such as Docker.
  • Foundational understanding of log aggregation systems such as ELK.
  • Familiarity with Linux environments and basic system commands.
  • Exposure to scripting concepts using Python, Bash, or similar languages.
  • Foundational knowledge of Artificial Intelligence (AI) and good exposure to AI agents; relevant certifications in AI or related disciplines will be an added advantage.

How to Apply

Send your resume and cover letter with the subject line SITE RELIABILITY ENGINEER INTERN to recruiting@interintel.co.ke
