Critical role in capacity management for VMware-based infrastructure supporting BT's Private Cloud platform. Transforming capacity management into a proactive capability aligned with business demand.
Responsibilities
Own the Private Cloud “EC.3” Capacity Management Platform – act as the single accountable owner for capacity planning, forecasting, modelling, and optimisation across the VMware-based Enterprise Cloud v3 environment.
Define and Deliver the Capacity Roadmap – translate business demand and programme milestones into a prioritised backlog of features and automation, using Agile delivery practices.
Implement SRE Guardrails – establish SLIs, SLOs, and error budgets for infrastructure-related reliability; ensure proactive risk management.
Develop Forecasting Models – build accurate short-, medium-, and long-term capacity forecasts using telemetry and scenario analysis to prevent saturation and ensure headroom.
Automate Capacity Workflows – reduce manual toil by creating scripts, policies, and integrations for rightsizing, placement, and quota enforcement using PowerCLI, APIs, and IaC.
Maintain Real-Time Telemetry & Dashboards – provide a single source of truth for utilisation, trends, and optimisation opportunities through VMware Aria Operations (vROps) and reporting tools.
Optimise Cost and Efficiency – align with FinOps principles to deliver showback/chargeback reporting, identify waste, and implement cost-saving measures without compromising reliability.
Integrate with ITSM & Governance – ensure ServiceNow CMDB accuracy, automate request fulfilment, and maintain compliance with capacity policies and audit requirements.
Collaborate Across Teams – work closely with Architecture, Programme Delivery, Finance, and Operations to align capacity decisions with strategic objectives and risk appetite.
Continuously Improve – evolve the capacity management capability through iterative enhancements, stakeholder feedback, and adoption of emerging best practices.
Requirements
Deep VMware Expertise – hands-on experience with vSphere, vCenter, vSAN, NSX-T, and VMware Aria Operations (vROps) for capacity analytics and optimisation.
Capacity Planning & Forecasting – ability to model demand, headroom, and growth scenarios using telemetry and data-driven methods.
Automation & Scripting – proficiency in PowerCLI, Python, and API integrations to automate rightsizing, placement, and quota enforcement.
Agile Delivery Skills – experience managing backlogs, writing user stories, and delivering incremental improvements through sprints and ceremonies.
SRE Practices – strong understanding of SLIs, SLOs, error budgets, and reliability engineering principles applied to infrastructure capacity.
Observability & Analytics – ability to design dashboards and alerts for utilisation, saturation, and optimisation opportunities.
FinOps Awareness – knowledge of cost optimisation, showback/chargeback models, and unit economics for infrastructure services.
Governance & Compliance – familiarity with ITSM tools (e.g., ServiceNow), CMDB data integrity, and audit-ready processes.
Stakeholder Engagement – excellent communication and influencing skills to align capacity decisions with business priorities.
Continuous Improvement Mindset – proactive approach to evolving processes, reducing toil, and adopting emerging best practices.
Benefits
From January 2025, equal family leave: receive 18 weeks at full pay, 8 weeks at half pay and 26 weeks at the statutory rate. It’s for all parents, no matter how your family is made up.
Enhanced women’s health support: including help with menopause symptoms, cancer screenings, period care and more.
25 days annual leave (not including bank holidays), increasing with service
24/7 private virtual GP appointments for UK colleagues
2 weeks carer’s leave
World-class training and development opportunities
DevOps Lead at Leidos managing platform engineering, SRE, and application security functions. Driving operational excellence and ensuring scalability for federal government applications.
SRE Lead developing scalable cloud-native solutions for mission-critical systems supporting USAF. Managing teams, collaborating with cross-functional units, and ensuring high service reliability standards.
Junior DevOps / Platform Engineer at DieEnergiekoppler GmbH managing AWS/EKS platform operations. Collaborating with team members to improve platform functionalities and security compliance.
DevOps Engineer responsible for AWS infrastructures and backend development at Allguth GmbH. Engaging in greenfield projects with modern solutions in a collaborative team.
Cloud DevOps Specialist responsible for building scalable infrastructure solutions in AWS at SONDA. Focusing on automation, containerization, and data management in a collaborative environment.
DevOps Engineer maintaining and evolving deployment pipelines for Docebo’s AI-powered learning platform. Collaborating with cross-functional teams to ensure efficient software releases and infrastructure management.
DevOps Engineer optimizing CI/CD pipelines for Docebo, an AI-powered learning platform. Involves managing multi-tenant infrastructure using AWS, Docker, and Kubernetes.
DevOps Engineer maintaining and automating infrastructure and CI/CD processes for cybersecurity solutions by NordLayer. Collaborating with teams to ensure performance and scalability of cloud services.
DevOps Engineer maintaining and improving infrastructure and CI/CD processes for a cybersecurity solutions provider. Collaborating with cross-functional teams for reliable and scalable cloud solutions.
DevOps Engineer maintaining and automating infrastructure and CI/CD processes at NordLayer. Collaborating with Senior Engineers to implement best practices in a dynamic cybersecurity environment.