Location
Winnipeg
Posted
June 05, 2026
Commute
Local Area
Local Opportunity Near You!
This job is in your area. Enjoy a short commute and work close to home.
Job Description
Become a key player as an L1 Site Reliability Engineer, focusing on operational tasks across enterprise applications. Your expertise in Kubernetes, APIs, and multi-cloud environments is essential for incident management and resolution.
In this role, you will handle monitoring, triaging, and executing crucial tasks using advanced tools like Grafana and Datadog. With 2-5 years in IT operations or DevOps, you'll support automation and improve incident response processes while ensuring systems are healthy and operational standards are met.
Key Responsibilities:
• Monitor systems with Grafana and Datadog for anomalies
• Execute predefined runbooks for incident resolution
• Collect logs and system data for analysis
• Troubleshoot issues using kubectl and automation scripts
• Document incident resolution steps and improvements
Requirements:
• 2-5 years in IT operations or SRE roles
• Proficient in Linux and Kubern...