Site Reliability Engineer - Datafin IT Recruitment
Pretoria/Centurion – Gauteng
Date Created : 13 days ago
Job Type : Permanent
Salary : Market Related
- Designing, implementing, and maintaining CI/CD pipelines for Kubernetes-based applications.
- Automating deployment processes and ensuring continuous integration and delivery of software.
- Implementing monitoring solutions for infrastructure and applications using tools such as Prometheus, Grafana, and Kubernetes-native monitoring.
- Generating reports on system performance, availability, and reliability.
- Analysing logs and metrics to identify trends, anomalies, and performance issues.
- Implementing log aggregation and analysis solutions like ELK Stack or Splunk.
- Investigating and resolving issues related to application performance, availability, and reliability in Kubernetes environments.
- Collaborating with development teams to diagnose and debug complex issues.
- Setting up alerting mechanisms to proactively detect and respond to incidents.
- Escalating critical issues to appropriate teams and stakeholders.
- Managing and maintaining Linux servers, including installation, configuration, and patch management.
- Implementing security measures and best practices for Linux-based systems.
- Managing user accounts, groups, and permissions in Active Directory.
- Performing routine maintenance tasks and ensuring the security of AD infrastructure.
- Configuring and managing DNS servers and zones.
- Troubleshooting DNS-related issues and ensuring DNS resolution reliability.
- Providing technical support and assistance to end-users for infrastructure-related issues.
- Resolving hardware, software, and connectivity problems promptly.
- Managing PostgreSQL databases, including installation, configuration, and performance tuning.
- Performing routine maintenance tasks such as backups, restores, and upgrades.
- 3+ years of experience in a Site Reliability Engineer role or similar position.
- Proficiency in Kubernetes administration and experience with CI/CD pipelines.
- Strong Linux administration skills, including shell scripting and troubleshooting.
- Experience with monitoring and logging tools such as Prometheus, Grafana, ELK Stack, or Splunk.
- Familiarity with Active Directory administration and DNS management.
- Experience with PostgreSQL database administration is a plus.
- Excellent communication and problem-solving skills.
- Ability to work effectively in a fast-paced, collaborative environment.
While we would really like to respond to every application, if you are not contacted about this position within 10 working days, please consider your application unsuccessful.
By applying to a job using RecruitmentPartner, you are agreeing to comply with and be subject to RecruitmentPartner Terms for use of our website.