Production Support Engineer at Tala supporting platform availability across global markets. Collaborating with engineering teams on incident response and process improvements in a remote-first environment.
Responsibilities
Support, monitor and maintain the high availability of our platform across all of our markets
Collaborate with engineering, CX, product and program teams globally for production incident response and post-mortem processes
Work closely with the CX and Collections teams to identify areas for improvement in our product based on their feedback and customer communication
Continuously review and improve existing monitoring and alerting systems
Requirements
2+ years of experience working in a technology environment, including experience with microservices architecture
1+ years of experience in incident response or a similar role
Experience working with a remote team in a global environment
Knowledge of various monitoring platforms such as AWS CloudWatch, Sumo Logic, APM tools (New Relic, Instana), mobile (Crashlytics data), BI (Looker, Snowflake)
Knowledge of relational databases and BI query languages, sufficient to construct queries during investigations
Experience with tools like Postman or with scripting API queries
Excellent debugging and documentation skills
Ability to coordinate incident response and communicate effectively with stakeholders from a variety of teams across different time zones
Ability to remain calm under pressure during production incident resolution
Candidates with QA, SRE or similar background are encouraged to apply
Site Reliability Engineer ensuring stability and security for ShiftKey’s Marketplace platform while executing AWS migration. Blends maintenance with engineering in a collaborative environment.
Production Engineer designing customer-oriented manufacturing concepts at Festo. Responsibilities include process development, documentation review, and collaboration with international teams.
Experienced Production Engineer supporting quality-critical processes and collaborating with teams to ensure high-quality pen needles. Engaging in stable operations and improvements within a 2-year temporary contract.
Production Support Engineer ensuring system stability and reliability for Manulife's critical services. Collaborative role bridging development and infrastructure, providing seamless service for customers.
Senior Production Engineer (SRE) at Legion building and operating a secure AWS/Kubernetes platform. Focused on automation, reliability, and infrastructure as code.
Production Engineer managing database operations at Palantir, ensuring reliability and availability of data systems. Involved in architecture, design, and maintenance of production databases in various environments.
Production Engineer PCB managing first-line technical support for PCB assembly processes. Assisting with product introduction and implementing process improvements in a leading transport solutions company.
Senior Production Support / DevOps Engineer at Keyrus focusing on application reliability and cloud operations. Support enterprise Java-based platforms in collaboration with development teams.
Lead Production Engineer managing production optimization initiatives across the enterprise for oil and gas. Act as the key authority in autonomous and semi‑autonomous production engineering standards.