Solution Architect developing comprehensive AI infrastructure solutions for deployment at d-Matrix. Collaborating with clients to enable successful integration of d-Matrix based solutions.
Responsibilities
Develop end-to-end AI infrastructure reference solutions optimized for d-Matrix servers including compute, networking, storage, and orchestration layers, in collaboration with various internal teams.
Create reference blueprints that integrate smoothly into cloud-native and on-prem environments.
Develop infrastructure-as-code templates and examples using Ansible, Terraform, and Helm for provisioning d-Matrix-based nodes and clusters.
Integrate with Kubernetes-based systems to enable model deployment, auto-scaling, and fault-tolerant execution.
Design and deploy telemetry and monitoring frameworks to support real-time visibility into d-Matrix cluster health, job status, and system performance.
Integrate with industry-standard observability stacks (e.g., Prometheus, Grafana, OpenTelemetry) for data collection, visualization, and alerting.
Develop dashboards, health check systems, and metric pipelines that track performance, availability, and operational KPIs.
Collaborate with performance and software teams to validate infrastructure using real-world workloads and benchmarks.
Incorporate telemetry hooks for benchmark reporting and feedback-driven tuning.
Create and publish detailed infrastructure deployment guides, monitoring configuration templates, and operational best practices.
Collaborate with customers and the OEM/ISV ecosystem to help them adopt and customize reference solutions for their specific datacenter environments and/or software stacks.
Requirements
Bachelor's or Master's degree in Computer Science or a related technical field.
10+ years of experience in infrastructure solution architecture, systems management, DevOps, or platform engineering roles.
Experience working with GPUs, custom AI accelerators, or heterogeneous compute environments.
Proven expertise in building, managing, and monitoring full-stack AI infrastructure at scale.
Data Center Solutions Engineer specializing in data architecture and analytics for GXO Logistics. Designing, building, and optimizing data solutions while writing complex SQL queries and creating dashboards.
Senior Solutions Architect designing digital manufacturing solutions at LyondellBasell. Leading architecture for production, reliability, energy, and sustainability use cases with a focus on innovation.
Bilingual Senior Solutions Architect designing end-to-end enterprise solutions for financial services. Collaborating with stakeholders in French- and English-speaking regions to develop scalable architectures.
Senior AI Solutions Architect designing and implementing enterprise AI solutions using Microsoft Azure. Collaborating with stakeholders and building scalable AI/ML pipelines.
Solution Architect bridging customer needs and platform engineering at Woven by Toyota. Collaborate with Inventors to design scalable solutions leveraging the Robot Platform.
Senior Solution Architect for Envitia designing modern data-driven solutions. Leading architecture delivery and pre-sales efforts in the Defence sector.
Data Solutions Architect at Envitia, delivering innovative data-centric solutions for clients. Collaborating with teams to architect solutions using AWS and Azure technologies while ensuring quality and compliance.
AI Solution Engineer focusing on AI/ML and data execution within platform engineering for federal agencies. Collaborating with a Solution Architect on sprint-based platform releases.
AI Solution Engineer supporting analytics workflows, data ingestion, and ML operations for a Supply Chain Enterprise program. Involves executing ETL processes and maintaining data pipelines in a government context.
Account Solution Architect at Red Hat assisting customers with hybrid cloud solutions. Building relationships and architecting innovative solutions across diverse industries.