Production Support & Monitoring Engineer ensuring reliability, performance, and availability for Exegy's production systems, and collaborating with teams to resolve incidents and optimize environments.
Responsibilities
Monitor production systems and infrastructure, ensuring uptime and performance metrics are met
Troubleshoot, diagnose, and resolve production issues in real time, minimizing service impact
Manage incident response, including escalation, root cause analysis, and post-mortem reporting
Collaborate with engineering teams to develop and implement monitoring tools, alert systems, and automated recovery processes
Analyze system logs, metrics, and trends to proactively identify potential risks or issues
Execute software deployments, configuration changes, and system upgrades with minimal disruption
Maintain and refine operational runbooks, escalation procedures, and best practices
Drive continuous improvement by identifying areas for process optimization and operational efficiency
Participate in an on-call rotation to provide 24/7 support for production systems
Requirements
Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent work experience
2+ years of experience in a production support, system administration, or monitoring role
Strong technical skills in Linux/Unix environments, with experience in troubleshooting and debugging
Hands-on experience with monitoring tools (e.g., ITRS, Prometheus, Grafana, Splunk) and incident management platforms
Scripting experience (e.g., Python, Bash) to automate monitoring and reporting tasks
Excellent problem-solving and analytical skills, with the ability to work under pressure in a fast-paced environment
Solid understanding of networking, system performance, and application monitoring concepts
Exceptional communication and collaboration skills to coordinate with cross-functional teams effectively