SRE / Observability Engineer at a leading financial services organization, focusing on observability and reliability. Building scalable digital platforms while safeguarding system stability and user experience.
Responsibilities
Contribute to the design and implementation of observability solutions (HLD, LLD)
Build and operate logging, metrics, and distributed tracing systems
Design and maintain monitoring dashboards and alerting strategies
Support incident analysis and root cause investigations
Drive improvements to system reliability using SRE principles
Define and implement observability standards and best practices
Automate monitoring and operational workflows
Collaborate with infrastructure and application teams to improve system visibility and operability
Requirements
Degree in Computer Science or a related field
2–3+ years of experience with modern observability tools (e.g. Prometheus, Grafana, ELK, Dynatrace, Splunk, OpenTelemetry)
2–3+ years of experience in infrastructure or cloud operations (on-prem and/or cloud)
Hands-on experience with containerized and cloud environments (Kubernetes, AWS, Azure)
Strong understanding of SRE principles and proactive problem-solving
Ability to analyze complex systems and identify patterns across logs, metrics, and traces
Intermediate level of English (technical communication)
Structured thinking, strong communication, and a collaborative mindset
Nice to have
Experience in financial or enterprise environments
Familiarity with Agile methodologies
Knowledge of large-scale integration architectures
Experience applying ML/AI in observability use cases
Benefits
Competitive compensation and comprehensive benefits package
Hybrid working model with home office flexibility
Support for professional development and continuous learning
Access to health and sports programs
Opportunity to shape observability strategy in a large-scale environment