Site Reliability Engineer ensuring reliability and scalability of fintech applications. Developing automation solutions to optimize performance and support critical systems in a hybrid role.
Responsibilities
Ensure the reliability and scalability of software applications.
Develop and manage automation scripts and tools to improve system performance and efficiency.
Monitor application performance and troubleshoot issues to ensure high availability and reliability.
Collaborate with development teams to ensure best practices for deployment and operations.
Conduct root cause analysis of incidents and implement corrective actions.
Participate in on-call rotations to provide 24/7 support for critical systems.
Identify and automate repetitive tasks to reduce manual toil and errors.
Contribute to and maintain design and process documentation.
Build and configure observability and Application Performance Management (APM) tools.
Understand, champion, and enforce security and compliance policies and procedures that adhere to frameworks such as PCI, NIST, and CIS.
Continually seek opportunities to improve SLA compliance and uptime and to minimize customer impact.
Requirements
Proven experience as an Application Support Engineer or in a similar role, including leadership experience.
Strong knowledge of automation tools and scripting languages.
Ability to code automation using a structured programming language like Python.
Proficiency in Linux.
Broad knowledge of the architecture of enterprise-level information technology building blocks (e.g., networking, databases, messaging, RBAC).
Understanding of internet technologies and microservice-based architecture (e.g., web servers, encryption, XML, HTTP, web services, APIs).
Excellent problem-solving skills and attention to detail.
Strong communication and collaboration skills.
Focus on scalability, high availability, performance, resiliency, and reliability of software applications.
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent relevant experience.