Principal IaaS Engineer leading the architecture and standardization of AI infrastructure, collaborating with data and security teams to enhance global infrastructure platforms.
Responsibilities
Architect and evolve the company’s IaaS platform across hybrid environments (on-premises, distributed), enabling secure and scalable compute foundations.
Design, build, and maintain infrastructure automation frameworks using Terraform, Pulumi, and Ansible, including development of custom providers and modules.
Define and enforce engineering standards for infrastructure provisioning, networking, and observability to ensure reliability, security, and consistency.
Lead evaluation and integration of core technologies including OpenShift, Kubernetes, MAAS, and Ceph to optimize performance, cost, and maintainability.
Drive multi-tenant PaaS initiatives and private cloud modernization leveraging OpenShift, Juju, and S3-compatible storage (Ceph, MinIO, TrueNAS).
Collaborate with Data, ML, and Platform Engineering teams to align IaaS architecture with emerging workloads such as data pipelines, MLflow, and Airflow orchestration.
Establish GitOps and CI/CD frameworks (ArgoCD, Helm, GitHub Actions, Azure DevOps) for consistent infrastructure delivery and configuration management.
Lead capacity planning, HA/DR strategy, and monitoring/alerting design using Prometheus, Grafana, and Loki stacks.
Partner with InfoSec to embed zero-trust principles, OIDC/SAML-based IAM, and secrets management best practices into the infrastructure lifecycle.
Mentor engineers and contribute to organization-wide technical enablement through documentation, workshops, and community participation.
Requirements
10+ years of experience designing and operating large-scale infrastructure systems across on-prem and cloud environments.
Proven expertise in Infrastructure as Code (Terraform, Pulumi, Ansible) with experience authoring reusable modules and providers.
Deep understanding of hybrid and private cloud platforms (OpenShift, Juju, MAAS, OpenStack, VMware, Proxmox).
Strong background in storage (Ceph, TrueNAS, S3, NFS) and networking (VLAN, VXLAN, SDN) for high-availability architectures.
Demonstrated experience building GitOps-based deployment pipelines and maintaining production-grade Kubernetes environments.
Familiarity with data and ML infrastructure integration (MLflow, Airflow, Databricks, or Spark preferred).
Strong proficiency in Python, Go, and Bash for automation and platform tooling.
Excellent cross-functional leadership, communication, and mentorship skills.