Site Reliability Engineer ensuring reliability and scalability of fintech applications. Developing automation solutions to optimize performance and support critical systems in a hybrid role.
Responsibilities
Ensure the reliability and scalability of software applications.
Develop and manage automation scripts and tools to improve system performance and efficiency.
Monitor application performance and troubleshoot issues to ensure high availability and reliability.
Collaborate with development teams to ensure best practices for deployment and operations.
Conduct root cause analysis of incidents and implement corrective actions.
Participate in on-call rotations to provide 24/7 support for critical systems.
Identify and automate repetitive tasks to reduce manual toil and errors.
Contribute to and maintain design and process documentation.
Build and configure observability and Application Performance Management (APM) tools.
Understand, champion, and enforce security and compliance policies and procedures that adhere to frameworks such as PCI, NIST, and CIS.
Continually seek opportunities to improve SLA attainment and uptime and to minimize customer impact.
Requirements
Proven experience as an Application Support Engineer or in a similar role, including leadership experience.
Strong knowledge of automation tools and scripting languages.
Ability to code automation using a structured programming language like Python.
Proficiency in Linux.
Broad knowledge of the architecture of enterprise-level information technology building blocks (e.g., networking, databases, messaging, RBAC).
Understanding of internet technologies and microservice-based architecture (e.g., Web servers, encryption, XML, HTTP, Web Services, APIs).
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills.
Focus on scalability, high availability, performance, resiliency, and reliability of software applications.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent relevant experience.
(Senior) DevOps Engineer at Wavestone developing and operating complex software solutions for digitalization projects. Collaborating in teams and contributing to technology landscape advancements.
Reliability Engineer focused on the dependability and mission success of complex space systems. Involvement includes analyses, collaboration, and adherence to aerospace reliability standards.
DevOps Engineer automating IT processes at Maurer Electronics GmbH in Hannover. Engaging in continuous integration and development with team collaboration and innovative solutions.
DevOps Engineer working with IT Security Team in Berlin, developing and supporting complex IT Security Services. Collaborating on automated IT-Security-Services with cutting-edge technologies and methodologies.
DevOps Engineer focusing on deploying high-security on-prem infrastructure and MLOps platforms for mission-critical systems. Collaborating on Kubernetes-based orchestration and machine learning workloads.
Cloud Site Reliability Engineer managing Solace Cloud services across leading cloud providers. Ensuring reliability, handling incidents, and collaborating with customers for operational excellence.
Senior Cloud Site Reliability Engineer ensuring reliability and health of Solace Cloud Services with hands-on cloud operations expertise. Lead incident management and customer support for high-impact environments.
DevOps Engineer designing and operating AWS infrastructure within industrial IoT environments. Working on systems that ensure security, resilience, and end-to-end observability.