SRE Specialist responsible for enhancing service availability and reliability at Morgan Stanley, collaborating with engineering teams and troubleshooting system issues.
Responsibilities
Work closely with engineering/development teams to design, build, and maintain systems.
Troubleshoot issues across the entire stack: hardware, software, application, and network.
Identify and drive opportunities to improve automation for our platforms.
Proactively identify and address system reliability risks.
Represent the RPE organization in design reviews and operational readiness exercises for new and existing services.
Participate in the on-call rotation and in periodic conference calls with specialists in other time zones.
Requirements
At least 4 years of experience in an SRE role.
Background in Computer Science equivalent to a B.Sc.
Automation-related experience using scripting languages such as Python, Bash, and Perl.
Experience with one higher-level language is desired.
Experience supporting three-tier architectures, including exposure to UNIX and Linux platforms and databases such as IBM DB2, Sybase, MongoDB, and Greenplum.
Experience with source code and binary repositories, build tools, and CI/CD (Git, Artifactory, Jenkins, Docker), as well as data streaming technologies such as Spark and Kafka.
Hands-on experience with enterprise tool sets such as Grafana, Dynatrace, and AppDynamics.
Deep understanding of operating-system-level concepts such as processes, memory allocation, and the network stack, and the ability to debug them.
Senior Reliability Engineer at Sonova ensuring dependable performance of hearing solutions for millions of users globally. The role applies engineering skills to improve product reliability across development stages.
Equipment and Reliability Engineer at Chobani responsible for improving asset efficiency and redesigning equipment. Collaborating with Operations to solve complex problems and leading projects in a team environment.
Reliability Engineer II focused on enhancing safety, efficiencies, and cost controls at Freeport-McMoRan mining operations. Collaborating with multiple teams and managing engineering projects.
Reliability Engineer I responsible for equipment failure analysis and improvement recommendations at Freeport-McMoRan's copper smelting operations. Ensuring uninterrupted production and managing equipment health through data analysis.
Designing, building, and maintaining the Kubernetes-based developer platform for Schwarz IT Barcelona. Collaborating with engineering teams to enhance services in Azure and Google Cloud.
Database Reliability Engineer managing MySQL database infrastructure at PointClickCare. Collaborating with Engineering and SRE teams for product development and reliable integration across the platform.
Team Lead in building cleaning in Grimma, responsible for planning, organizing, and leading the cleaning team. Hands-on participation and adherence to hygiene and quality standards are required.
Service Reliability Engineer providing technical support and managing incidents for BT International. Ensuring system availability and collaborating with global stakeholders to achieve objectives.
Studying Bachelor of Arts in Accounting, Taxation, and Economic Law while gaining practical experience in a dynamic team. Benefit from a diverse working day and continuous development opportunities.
Technical Trainer conducting workshops and training sessions on MERKUR Group's product content for diverse audiences. Engaging with employees and clients to ensure smooth product operation and understanding.