Solution Architect developing comprehensive AI infrastructure solutions for deployment at d-Matrix. Collaborating with clients to enable successful integration of d-Matrix-based solutions.
Responsibilities
Develop end-to-end AI infrastructure reference solutions optimized for d-Matrix servers including compute, networking, storage, and orchestration layers, in collaboration with various internal teams.
Create reference blueprints that integrate smoothly into cloud-native and on-prem environments.
Develop infrastructure-as-code templates and examples using Ansible, Terraform, and Helm for provisioning d-Matrix-based nodes and clusters.
Integrate with Kubernetes-based systems to enable model deployment, auto-scaling, and fault-tolerant execution.
Design and deploy telemetry and monitoring frameworks to support real-time visibility into d-Matrix cluster health, job status, and system performance.
Integrate with industry-standard observability stacks (e.g., Prometheus, Grafana, OpenTelemetry) for data collection, visualization, and alerting.
Develop dashboards, health check systems, and metric pipelines that track performance, availability, and operational KPIs.
Collaborate with performance and software teams to validate infrastructure using real-world workloads and benchmarks.
Incorporate telemetry hooks for benchmark reporting and feedback-driven tuning.
Create and publish detailed infrastructure deployment guides, monitoring configuration templates, and operational best practices.
Collaborate with customers and the OEM/ISV ecosystem, enabling them to adopt and customize reference solutions for their specific datacenter environments and software stacks.
Requirements
Bachelor's or Master's degree in Computer Science or a related technical field.
10+ years of experience in infrastructure solution architecture, systems management, DevOps, or platform engineering roles.
Experience working with GPUs, custom AI accelerators, or heterogeneous compute environments.
Proven expertise in building, managing, and monitoring full-stack AI infrastructure at scale.