Senior Site Reliability Engineer at T-Mobile, enhancing system reliability and resilience while facilitating software development and deployment. Ensures the performance of the Network Supply Chain ecosystem, which spans multiple systems.
Responsibilities
Enhancing system reliability and resilience
Facilitating faster and more efficient software development and deployment
Automating processes to reduce manual effort and prevent operational incidents
Owning production as well as non-production environments
Continuously learning new skills and technologies
Driving the vision to ensure stability and security
Proposing optimal solutions for efficiencies
Coaching engineers and junior team members on best practices
Participating in advanced troubleshooting of production and pre-production systems
Collaborating on architectural, technological, and infrastructural discussions
Requirements
Bachelor's Degree in Computer Science, Engineering or related field (Preferred)
Master's/Advanced Degree in Computer Science, Engineering or related field (Preferred)
4-7 years working in operations or development environments
4-7 years troubleshooting customer-related issues and managing customer relationships
4-7 years developing software solutions using Python or similar programming languages
4-7 years of progressive experience in software engineering/maintenance across multiple products, systems and/or platforms coupled with strong business acumen
4-7 years of experience in Enterprise applications, middle-tier services, database, storage, distributed computing, virtualization and/or application technology.
Experience working in an Agile and DevOps environment.
Experience in one or more of: JavaScript, Java, .NET, API Gateway, MongoDB, Oracle, Spring Boot, Angular, etc.
Experience with Continuous Integration/Continuous Delivery tools, such as Jenkins and CloudBees, and other automation tools.
Experience with DevOps tools, such as Ansible, Chef, and Puppet.
Experience in Docker, Kubernetes, and Deep.io is preferable.
Experience with APM tools, such as AppDynamics, OTel, and Splunk.
Incident Management: Understanding of incident response management and operational support (Required)
Experience designing and maintaining CI/CD pipelines (Required)
Benefits
Medical, dental and vision insurance
Flexible spending account
401(k)
Employee stock grants
Employee stock purchase plan
Paid time off and up to 12 paid holidays
Paid parental and family leave
Family building benefits
Back-up care
Enhanced family support
Childcare subsidy
Tuition assistance
College coaching
Short- and long-term disability
Voluntary AD&D coverage
Voluntary accident coverage
Voluntary life insurance
Voluntary disability insurance
Voluntary long-term care insurance
Mobile service & home internet discounts
Access to commuter and transit programs
Job title
Senior Engineer, Site Reliability – Network Supply Chain