Senior Site Reliability Engineer managing cloud infrastructure for SaaS solutions at PROS Holdings. Focusing on reliability, automation, and team collaboration in a hybrid work environment.
Responsibilities
Design, implement, and maintain secure, scalable infrastructure across cloud environments
Analyze cloud environment requirements from various sources, document system designs, and implement necessary modifications
Automate repetitive system tasks and manage system-related activities for internal and external clients, including Professional Services support
Ensure system reliability through robust failover mechanisms, disaster recovery processes, and 24/7 support strategies
Design, implement, and improve monitoring tools to meet SLOs, ensuring a “Monitor by Design” approach is adopted across product teams
Continuously drive reliability improvements through proactive initiatives, data-driven SLO adjustments, and advanced monitoring/alerting solutions
Lead and coordinate disaster recovery testing exercises and capacity planning to enhance system reliability
Identify and reduce operational toil through automation and tool development
Apply and enforce security best practices across cloud environments, while mentoring team members on SLO achievement
Facilitate cross-team communication, provide training, and maintain clear documentation (e.g., runbooks and procedures)
Support cloud environment management and propose technology changes to improve performance and reliability
Requirements
7+ years of experience as a System Administrator, DevOps Engineer, SRE, or in a similar role
Deep knowledge of Linux administration, including performance monitoring, tuning, and troubleshooting
Experience with cloud network design (Azure preferred, AWS or GCP also considered)
Proficiency in scripting (e.g., Bash, Python) for automation
Experience with version control software (preferably Git)
Experience with configuration management tools (e.g., Puppet, Foreman, Ansible)
Knowledge of container orchestration tools (e.g., Kubernetes, Docker Swarm)
In-depth knowledge of monitoring and logging solutions for cloud infrastructure (e.g., Prometheus, Grafana)
Bachelor’s degree in Computer Science or a related field
Excellent time management, organizational, crisis management, and problem-solving skills
Self-starter, able to work independently without direct supervision
Willingness to innovate, learn, and share knowledge
Excellent verbal and written communication skills
Experience developing and implementing IT security best practices and procedures
Willingness to participate in on-call rotations and respond to incidents in a timely and effective manner
Senior DevOps Engineer managing DevOps processes and tooling for customer-facing platforms at Luminor. Building CI/CD pipelines and providing production support with a focus on mentoring and collaboration.
Building and maintaining DevOps processes and CI/CD pipelines for Luminor's banking champion. Collaborating in a flexible work environment with international teams.
Senior DevOps Engineer at Luminor, a leading bank in the Baltics, managing customer-facing platforms and infrastructure. Building CI/CD pipelines and mentoring junior engineers.
Sr. Site Reliability Engineer designing and automating robust technical infrastructure at Broadridge. Collaborating across teams for successful deployment and operational support of services.
Senior Fleet Reliability Engineer maintaining high fleet uptime for autonomous vehicle technology. Collaborating with technical teams to ensure peak operational performance in data collection efforts.
DevOps Lead at Leidos managing platform engineering, SRE, and application security functions. Driving operational excellence and ensuring scalability for federal government applications.
SRE Lead developing scalable cloud-native solutions for mission-critical systems supporting USAF. Managing teams, collaborating with cross-functional units, and ensuring high service reliability standards.
Junior DevOps / Platform Engineer at DieEnergiekoppler GmbH managing AWS/EKS platform operations. Collaborating with team members to improve platform functionalities and security compliance.
DevOps Engineer responsible for AWS infrastructures and backend development at Allguth GmbH. Engaging in greenfield projects with modern solutions in a collaborative team.
Cloud DevOps Specialist responsible for building scalable infrastructure solutions in AWS at SONDA. Focusing on automation, containerization, and data management in a collaborative environment.