Hybrid Production Support Engineer

Posted 3 weeks ago

About the role

  • As a Production Support & Monitoring Engineer, you will ensure the reliability, performance, and availability of Exegy's production systems, collaborating with teams to resolve incidents and optimize environments.

Responsibilities

  • Monitor production systems and infrastructure, ensuring uptime and performance metrics are met
  • Troubleshoot, diagnose, and resolve production issues in real time, minimizing service impact
  • Manage incident response, including escalation, root cause analysis, and post-mortem reporting
  • Collaborate with engineering teams to develop and implement monitoring tools, alert systems, and automated recovery processes
  • Analyze system logs, metrics, and trends to proactively identify potential risks or issues
  • Execute software deployments, configuration changes, and system upgrades with minimal disruption
  • Maintain and refine operational runbooks, escalation procedures, and best practices
  • Drive continuous improvement by identifying areas for process optimization and operational efficiency
  • Participate in an on-call rotation to provide 24/7 support for production systems

Requirements

  • Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent work experience
  • 2+ years of experience in a production support, system administration, or monitoring role
  • Strong technical skills in Linux/Unix environments, with experience in troubleshooting and debugging
  • Hands-on experience with monitoring tools (e.g., ITRS, Prometheus, Grafana, Splunk) and incident management platforms
  • Scripting experience (e.g., Python, Bash) to automate monitoring and reporting tasks
  • Excellent problem-solving and analytical skills, with the ability to work under pressure in a fast-paced environment
  • Solid understanding of networking, system performance, and application monitoring concepts
  • Exceptional communication and collaboration skills to coordinate with cross-functional teams effectively

Job title

Production Support Engineer

Job type

Experience level

Junior, Mid level

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
