Cloud Operations Engineer handling advanced troubleshooting and system administration for secure cloud environments. Operating compliance-controlled cloud environments and maintaining system stability.
Responsibilities
Advanced troubleshooting for infrastructure, OS, and application issues.
Analyze system logs, metrics, and telemetry from monitoring platforms.
Coordinate with Platform/DevOps Engineers on root cause analysis.
Ensure timely resolution of escalated incidents in accordance with SLAs.
Manage and maintain AWS, Azure, and hybrid environments.
Execute system patching, upgrades, and configuration changes.
Perform health checks, deployment validations, and post-change verifications.
Maintain infrastructure documentation and system configuration inventories.
Perform advanced application troubleshooting for web-based applications.
Work with DevOps/Platform teams to optimize CI/CD deployment workflows.
Requirements
3–5 years of experience in cloud operations, system administration, or infrastructure support.
Hands-on experience with CrowdStrike Falcon endpoint protection.
Hands-on experience using Grafana or Datadog for operational monitoring.
Proficiency in command-line troubleshooting.
Strong working knowledge of AWS and/or Azure infrastructure services.
Familiarity with CI/CD pipelines and deployment automation tools.
Understanding of advanced troubleshooting techniques for web-based applications.
Experience writing and maintaining scripts (Bash, Python, PowerShell).
Familiarity with FedRAMP, NIST 800-53, or similar compliance environments.
Required Certifications: AWS SysOps Administrator, Microsoft Azure Administrator, CompTIA Security+.
Due to the nature of our work with federal government clients, this position requires U.S. citizenship.