Manager leading observability platform operations across logging, metrics, telemetry, and analytics in Canada. Focused on reliability, scalability, and compliance, in partnership with Agile product teams.
Responsibilities
Own platform operations and roadmap (Elastic, Dynatrace, Micro Focus, Grafana)
Manage capacity, cost, performance, and security
Govern logging, telemetry, tracing, topology, and data lifecycle/quality
Publish standards and guardrails; ensure compliance via gating and maturity checks
Align governance with enterprise architecture
Manage vendor relationships in collaboration with the Director
Build partnerships across IT, application owners, infrastructure, and SDLC stakeholders
Coach on instrumentation, alert hygiene, dashboards, tracing, and topology
Lead the observability community and deliver shared trainings and templates
Communicate platform health, adoption, coverage, and outcomes
Enrich signals with app, infra, and network data; apply anomaly detection and AI to reduce noise
Provide reusable dashboards, alert policies, runbooks, and instrumentation patterns
Strengthen incident response and major-incident support; contribute to post-mortems
Implement enhancements that lower detection time and MTTR
Automate provisioning, config-as-code, data onboarding, alerting, and visualization
Embed observability in CI/CD and pre-release checks; promote “observability by default”
Support SRE goals by enabling SLIs/SLOs/SLAs and improving reporting
Manage two Agile product teams and a 24/7 Operations Center
Develop talent and a culture of automation, reliability, and customer service
Maintain backlog and roadmap; prioritize features and cost; drive continuous improvement and report outcomes
Requirements
6+ years in observability/platform engineering, with 2+ years leading ops/platform teams
Hands-on expertise with observability tooling (Elastic Stack, Dynatrace, Grafana, Micro Focus) and pipelines for logging, metrics, tracing, and topology
Experience building automation and self-service for observability (IaC, CI/CD, config-as-code) and integrating observability into the SDLC
Familiarity with multi-cloud (AWS, Azure, GCP) and on-prem environments; hybrid infrastructure visibility across Canada
Background in 24/7 operations and service management
Strong communication, stakeholder partnership, coaching, and vendor management skills
Experience with AI/ML anomaly detection and analytics in observability contexts
Familiarity with SAFe/Agile product management and platform roadmaps
Exposure to performance engineering and enterprise architecture standards
Financial Manager responsible for financial strategy and controlling at IRON Media GmbH. Collaborating closely with management to enhance company growth and drive strategic decisions.
Project Manager organizing digital events at drink&paint, a startup aiming to become Europe's largest creative events provider. Collaborating with teams to ensure smooth event execution.
Country Manager overseeing strategic realignment in the German generics market. Ensuring reliable medication delivery and preventing supply shortages through proactive management.
Asset Management Manager leading asset, inventory, and logistics operations for PayFacto. Collaborating with various teams to drive operational excellence while ensuring robust management processes.
Facilities Maintenance Manager leading the maintenance team for Honeywell's facilities. Overseeing operations to maintain high standards and contribute to overall business success.
Assistant Manager for Patient Access Scheduling at Connecticut Children’s. Supervising daily operations and staff development in a healthcare environment.
Senior Manager, EHS APAC providing strategic H&S guidance and collaborating with the global H&S community at Digital Realty. Leading implementation of safety standards and engaging with contractors and regulatory bodies.
Housing Case Manager providing services to chronically homeless adults in the Nashville area. Focused on case management, training, and housing stability for clients.
Testing Manager leading a team of test specialists at Information Technology Strategies, Inc. Overseeing complex test programs for government IT projects with an emphasis on quality and methodology.
Branch Manager and lead sales executive heading the Kalispell market for Stockman Insurance. Responsible for personnel administration and promoting quality customer service.