Cloud Infrastructure Manager overseeing 24/7 multi-cloud operations for a SaaS portfolio. Leading the SRE team and ensuring cloud infrastructure supports business growth.
Responsibilities
Lead day-to-day cloud operations (deployments, monitoring, incident resolution, change management, and system administration) across multi-cloud environments, with a strong focus on AWS, in a 24/7 multi-region SaaS setup.
Provide hands-on technical leadership for complex infrastructure projects, system upgrades, and major incident responses, serving as the escalation point for the team.
Build, lead, and develop a high-performing SRE team through coaching, mentoring, and feedback, while setting clear performance goals and career paths.
Collaborate with the central FinOps team to drive cloud cost optimisation, implementing cost monitoring frameworks to achieve budget targets.
Lead technical integration of newly acquired businesses into Ideagen’s infrastructure, ensuring smooth consolidation and standardisation with minimal disruption.
Implement robust observability strategies—monitoring, logging, alerting, and dashboards—for proactive issue detection and stakeholder visibility.
Partner with Security teams to ensure all infrastructure meets cybersecurity and compliance standards, including participation in audits.
Oversee vendor management, capacity planning, disaster recovery, and business continuity to maintain high availability, strong performance, and continuous improvement through automation and process optimisation.
Requirements
Proven experience managing 24/7 production environments in multi-region, multi-vendor cloud SaaS platforms.
Cloud expertise: Hands-on experience with AWS and/or Azure infrastructure, including IaaS, PaaS, and managed services.
Database management: Experience with both on-premises and cloud databases (e.g. MSSQL, MySQL, PostgreSQL, Aurora, SQL Azure).
Modern DevOps practices: Proficiency with containerization, orchestration, and IaC tools (Docker, Kubernetes, Helm, Terraform).
Leadership experience: Track record of successfully leading and developing technical teams.
Communication skills: Excellent written and verbal communication with ability to engage technical and business stakeholders.
Problem-solving: Strong analytical and strategic thinking capabilities with a bias toward action.
Agile experience: Familiarity working in agile development environments with cross-functional teams.
Desirable
Compliance frameworks: Experience with ISO27001, SOC2, FedRAMP, or similar compliance standards.
Service Management: Understanding of ITIL service management framework.
Cost optimisation: Experience with cloud FinOps practices and cost management tools.
Observability platforms: Experience with tools like New Relic, Datadog, Prometheus, or similar.
Software Development Lifecycle: Deep understanding of SDLC and CI/CD practices.
Scripting/automation: Proficiency in Python, Bash, PowerShell, or similar languages.
Acquisition integration: Experience integrating infrastructure from M&A activities.
Software Engineer integrating hardware features into core platforms at Red Hat. Collaborates with teams to ensure seamless operations between hardware and software.
Director of Cloud Platform Engineering overseeing cloud platform capabilities for critical healthcare services. Leading a large engineering organization with emphasis on automation and secure-by-design architectures.
Azure Cloud Architect designing and implementing multi-cloud solutions for Kyndryl customers. Leading client workshops and co-creation sessions to optimize technology strategies.
Senior Cloud Engineer for Blend Technologies specializing in cloud solutions and infrastructure. Driving projects to enhance innovation and technology in the digital transformation era.
Associate Cloud Engineer working with Library Technology at Vanderbilt University to enhance digital collection discoverability. Collaborating with teams to develop secure applications and manage cloud services.
Network Engineer developing and maintaining secure networks for DoD missions. Collaborating with various teams and ensuring the integrity and functionality of network systems.
Principal Engineer leading the shift towards modern service-based architectures at ClearPoint. Focus on deep expertise in JavaScript, TypeScript, and cloud-native deployments.
GCP Engineer specializing in Agentic AI technologies to design and maintain scalable AI platforms on Google Cloud. Working across the AI lifecycle from pipeline to infrastructure management.
Salesforce Loyalty Management Cloud Architect responsible for designing and implementing loyalty solutions. Focused on integrating Salesforce services and ensuring customer experience in the financial sector.