Senior SRE driving incident management and operational excellence in financial software solutions. Working with innovation and technology on the team of Brazil's leading software company.
Responsibilities
Lead high-impact incidents end-to-end, including investigation, mitigation, service recovery and technical communication.
Resolve problems without ready-made solutions or documentation, investigating deep root causes and proposing definitive fixes.
Make decisive technical decisions under pressure, evaluating risks and impacts in production environments.
Create, review and standardize operational playbooks and incident response procedures.
Evolve reliability, resilience and observability practices for the operation.
Operate across production, staging and development environments, ensuring continuous availability.
Identify automation opportunities and improve operational workflows.
Provide technical support to internal and external customers with a consultative approach and clear communication.
Participate in technical calls and meetings with customers to analyze, update and drive issue resolution.
Act as a mentor, guiding less experienced professionals during incidents and promoting best practices.
Requirements
Bachelor's degree in a Technology-related field.
Strong experience in advanced troubleshooting, including:
– Deep log analysis;
– Validation of complex environments;
– Structured evidence collection;
– Investigation of critical incidents.
Experience supporting high-criticality production environments.
Proven ability to document, communicate and lead technically during incidents.
Benefits
Meal allowance or food card;
Flexible Benefit (Flash);
Health insurance;
Partners for psychological, legal, financial and nutritional support (CLUDE, C4LIFE and ASQ);
Psicologia Viva;
Dental assistance;
Childcare assistance;
Support for children with special needs;
Fertility treatment assistance;
Extended maternity and paternity leave;
Commuter allowance or Home Office allowance (for telework contracts);
Gympass (Wellhub) and TotalPass;
Flexible working hours;
Life insurance;
Partner discounts club;
Partnership with Sesc;
No dress code (casual dress);
Day off on your birthday;
Beca (education incentive program);
PPR or Bonus — based on achievement of targets and results.
Senior Site Reliability Engineer maintaining reliability and user experience of AI services for Woven by Toyota. Collaborating with engineering teams to ensure service availability and performance.
DevOps Specialist supporting the engineering and operational enablement of next-gen data center platforms at KONE. Involves Infrastructure-as-Code deployments and daily DevOps workflows.
GitHub Enterprise Specialist managing KONE's GitHub ecosystem, ensuring secure and scalable workflows. Collaborating with teams to enhance developer productivity through AI-powered capabilities.
Senior Software Engineer responsible for designing microservices and enhancing LLM performance for Fortanix's Generative AI platform. Collaborating with data science and ML Infrastructure teams for security and optimization.
Reliability Engineering Technician conducting various verification tests and collaborating with reliability engineers. Preparing technical documentation in a well-equipped laboratory environment in Poland.
Reliability Engineer ensuring quality and reliability of products. Conducting various verification tests in a well-equipped laboratory in Mierzyn, Poland.
Salesforce DevOps Engineer focused on CI/CD pipeline management for Salesforce at S&P Global Mobility. Collaborating with cross-functional teams to ensure stable and secure releases.
Senior DevOps Engineer designing and building infrastructure for AI workloads across cloud and edge environments. Collaborating with engineering teams to implement scalable, automated solutions.
Mid-level Site Reliability Engineer at WEX managing Azure Cloud systems and driving reliability practices. Collaborating with teams to enhance performance, reduce toil and automate processes.
Reliability Engineer II improving efficiencies and safety in copper mining operations at Freeport-McMoRan. Developing recommendations for engineering projects and collaborating with Operations and Maintenance teams.