Site Reliability Engineer focused on automation and optimization of software application performance. Collaborating with cross-functional teams to enhance scalability and reliability in Chennai/Bangalore.
Responsibilities
Understand project KPIs, SLIs, SLOs, MTTD, MTTR, error budgets, and chaos engineering, and eliminate toil through automation
Explore observability tools and create and implement dashboards
Run the production environment by monitoring availability and taking a holistic view of system health
Incident management: knowledge of handling incidents, participating in blameless postmortems, performing root cause analysis, and implementing post-incident reviews
Develop scripts to reduce toil, automate repetitive tasks, and resolve recurring issues
Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement
Requirements
6 to 9 years of experience supporting Unix/Linux/Windows-based application environments
Exposure to the latest SRE, cloud, and DevOps technologies
Knowledge of containers, Docker, and Kubernetes/OpenShift tools
Experience with system and application monitoring and observability tools: Splunk, Prometheus, Grafana, Dynatrace
Hands-on experience writing automation scripts in PowerShell, Python, or shell
Knowledge of any RDBMS or NoSQL database
Good knowledge of the application support domain
Experience managing third-party tools
Skill in using tools such as Terraform and Ansible to automate infrastructure management
Benefits
Competitive salary and a fantastic range of benefits designed to help support your everyday life and wellbeing
Time to support charities and your community
A work environment built on collaboration, flexibility, and respect
An expansive range of professional education and personal development possibilities – FIS is a final career step!
Job title
AWS DevOps Production Support, Unix, Shell Scripting, Splunk/Dynatrace, Kubernetes/OpenShift
DevOps Engineer at Aifano GmbH developing AI-driven enterprise solutions. Involves CI/CD pipeline management, cloud infrastructure setup, and collaboration with development teams.
Lead Infrastructure Engineer at U.S. Bank responsible for managing and configuring cloud systems and infrastructure technologies while promoting automation practices.
Site Reliability Engineer ensuring the availability and performance of services for autonomous vehicle operations. Collaborating on system design and automation in a robotics-focused environment.
DevOps Engineer automating continuous deployment and monitoring on AWS for Crown Equipment Corporation. Bridging developers, IT, and external providers for operational efficiency.
Senior DevOps Engineer responsible for leading CI/CD pipeline design and optimization. Collaborating with teams to drive DevOps maturity across the enterprise while managing infrastructure automation.
Cloud Operations Engineer ensuring reliable performance of cloud systems at 2Innovate. Focused on automation, incident management, cloud security, and infrastructure monitoring in cloud environments.
AWS DevOps Engineer responsible for delivering scalable digital experiences for EXL's MarTech ecosystem. Engaging in development, maintenance, and collaboration across stakeholders and services.
Senior Site Reliability Engineer managing critical infrastructure at Hornetsecurity. Collaborating with product teams to ensure performance and reliability across services.
Site Reliability Engineer enhancing platform reliability for AI workflows at WRITER. Overseeing automated solutions and cloud infrastructure supporting high-traffic AI systems.