Intermediate Site Reliability Engineer at PointClickCare, designing AI-powered solutions for observability and automation in healthcare systems. Focus on building resilient infrastructure using AI and ML.
Responsibilities
Build ML-based anomaly detection and pattern recognition systems.
Enhance telemetry with smart tagging and metadata for better AI insights.
Develop event-driven workflows and self-healing systems using AI triggers.
Automate incident response with generative AI and custom AI agent orchestration.
Use time-series forecasting and predictive modelling to anticipate failures.
Optimise infrastructure with AI-powered autoscaling and cost-aware resource allocation.
Build scalable, fault-tolerant systems in a cloud-native environment.
Participate in on-call rotations and lead incident response for critical systems.
Integrate APIs to streamline data exchange and system connectivity.
Run internal AIOps workshops and help teams adopt AI maturity models.
Champion responsible AI practices and ethical automation.
Requirements
5+ years of experience in software engineering.
Experience with SRE principles.
Experience with AI/ML in production environments.
A passion for automation, intelligent systems, and operational excellence.
Strong debugging, problem-solving, and system design skills.
Senior Site Reliability Engineer managing the reliability and operational health of the Loan Origination System for a fintech company. Collaborating with engineering teams in Brazil and the US to improve system reliability.
Cloud Engineer working with Azure DevOps and digital transformation in a global team at EY. Collaborating on cloud engineering projects and supporting CI/CD pipeline development.
DevOps Engineer creating better conditions for developers in Saab's defence technology. Collaborating with developer teams for effective continuous development and delivery of software.
DevOps Infrastructure Engineer at Bull, reinforcing the AdminLab team in Echirolles. Working on Linux infrastructure and automation practices in an HPC environment.
Product Quality & Reliability Engineer developing quality and reliability standards for Applied Materials. Designing methods for testing products and analyzing operational data in a supportive team environment.
DevOps System Engineer creating and managing infrastructure for ESET's global SaaS service. Collaborating with tech teams to maintain secure and stable operations.
Provides expertise in business applications design and functionality. Supports users and validates technical designs for alignment with business needs.
Senior Site Reliability Engineer supporting the reliability and performance of Broadridge’s fintech platform. Collaborating with senior engineers on automation, infrastructure, and production stability.
DevOps Engineer at Mindera focusing on Windows environments and Azure cloud solutions. The role involves system modernization, automation, and migration projects with collaborative teams.
DevSecOps Lead supporting Synthesized's cloud automation strategy with a focus on security and compliance. Collaborating closely with development teams to shape cloud architecture and enhance deployment processes.