Site Reliability Engineer responsible for monitoring and improving production systems at ING. Leading teams to ensure high reliability and performance of business-critical applications.
Responsibilities
Monitor & troubleshoot production systems
Lead and mentor a growing team
Drive observability, incident/problem management, and root cause analysis
Plan capacity and assess system health
Manage SLAs and hold teams accountable
Automate operations using bots
Coordinate disaster recovery and business continuity planning
Generate production reports
Requirements
Understanding and application of Site Reliability Engineering principles
Experience in designing and operating highly available systems
Hands-on experience with containerization and observability tools
Expertise in Linux/Unix, Tomcat, networking, and RDBMS
Proficiency in Windows systems administration, ASP.NET/.NET Core apps on IIS
Ability to define and work with SLIs, SLOs, and SLAs
Strong skills in monitoring, observability, and incident management
Scripting or coding in PowerShell and familiarity with CI/CD pipelines
Experience with Agile platforms and rapid scaling
Familiarity with ELK stack, Prometheus + Grafana, and enterprise scheduling tools
Understanding of distributed systems and microservices architecture
Reliability Engineer providing support to maintenance and operations teams for critical gold processing assets. Ensuring equipment reliability and leading improvement initiatives at Gruyere Gold Mine.
Staff Reliability Engineer at an insurance company enhancing stability and performance of systems. Collaborating across teams to implement best practices and mentor others in reliability engineering.
Reliability Engineer at Mosaic Company providing in-depth analysis on mechanical systems to reduce risk. Supporting operations in reliability improvement initiatives across refinery and minefield.
DevOps Engineer at MYOB enhancing core business management systems for small to medium enterprises in Australia and New Zealand. Focused on operational excellence and stability.
(Senior) DevOps Engineer automating IT processes and managing CI/CD for digital solutions in a technology company. Collaborating with product owners and engineering teams to ensure secure digital solutions.
Azure Cloud Operations Engineer optimizing the cloud infrastructure in Vienna for innovative work management software. Collaborate on cloud solutions with a dynamic international team.
Join CI&T as a DevOps Master on a technology transformation involving a corporate developer platform. Collaborate closely with teams to enhance scalability and operational efficiency.
DevOps Engineer responsible for developing and operating CI/CD pipelines in hybrid environments. Join K-tronik to work on innovative software and hardware projects within a dedicated team.
Senior SRE managing reliability of 300+ servers powering client Odoo ERP systems. Lead incident response and guide a team in building reliable systems.