Senior DevOps Engineer at Demandbase in Hyderabad, building and operating the developer platforms that let engineers ship reliably to production. The role focuses on improving developer experience and cloud infrastructure.
Responsibilities
Build and operate the platforms, tooling, and workflows that enable engineers to ship reliably to production.
Partner with software, data, and security engineering teams to identify friction across the software delivery lifecycle and address it through automation, platform abstractions, and improved workflows.
Design and evolve developer-facing platforms and tooling that standardize how services and pipelines are built, deployed, and operated.
Enable self-service workflows with opinionated defaults that improve reliability, security, and consistency without slowing teams down.
Use developer feedback, operational data, and production signals to prioritize and drive the DevEx roadmap.
Design, build, and maintain CI/CD orchestration that supports high release velocity, strong security guardrails, and local-to-production parity, preferably using GitLab CI/CD.
Standardize build, test, and deployment patterns across application and data workloads.
Support modern deployment strategies and GitOps-based workflows.
Build, operate, and evolve Kubernetes-based platforms across AWS and GCP, including EKS and GKE.
Enable teams to run workloads on Kubernetes by providing clear operational guardrails, platform defaults, and documented best practices.
Manage multi-account cloud environments with a focus on security, scalability, and ease of use.
Design and maintain infrastructure using Infrastructure as Code, including Terraform and Crossplane.
Build and operate internal platform components such as GitOps tooling, secret management systems, and service mesh infrastructure.
Operate and evolve observability platforms (e.g., Prometheus, Mimir, Thanos, Grafana, Datadog) to provide actionable signals for platform and application teams.
Define and apply SLIs, SLOs, alerting strategies, and incident response practices.
Lead and participate in blameless post-mortems, translating learnings into platform improvements and reduced operational toil.
Support engineering teams running data pipelines and batch workloads on platforms such as Airflow, EMR, and Dataproc.
Standardize deployment, observability, and operational patterns for data workloads.
Improve reliability and operability of data platforms through shared tooling and best practices.
Serve as a technical leader within DevEx, promoting best practices in platform engineering, reliability, and secure software delivery.
Mentor engineers and influence teams through strong technical design, documentation, and collaboration.
Drive adoption of internal platforms through strong defaults, clear documentation, and self-service tooling.
Requirements
8+ years of overall engineering experience, including hands-on software development and cloud infrastructure ownership.
Strong software engineering fundamentals with experience in at least one general-purpose programming language (e.g., Go, Python, Java).
5+ years of experience building and operating cloud infrastructure on AWS and/or GCP at scale.
Proven experience managing multi-account cloud environments, including IAM, networking, and security best practices.
Strong proficiency with Infrastructure as Code, preferably Terraform and Crossplane.
Extensive experience operating Kubernetes platforms in production, including EKS and/or GKE.
Experience managing multiple Kubernetes clusters, including upgrades, networking, and security.
Hands-on experience with service mesh technologies such as Istio in multi-cluster environments.
Deep experience designing and operating CI/CD systems that support high release velocity, preferably GitLab CI/CD.
Experience building developer-facing tooling that improves local-to-production parity and reduces cognitive load.
Familiarity with GitOps practices and modern deployment strategies.
Experience supporting data platforms such as Airflow, EMR, and Dataproc.
Strong experience building and operating observability platforms including Prometheus, Mimir, Thanos, Grafana, and Datadog.
Solid understanding of SLIs, SLOs, alerting, and incident response.
Demonstrated ability to partner with engineering teams to identify pain points and improve developer experience.
Strong communication skills, including experience participating in or leading blameless post-mortems.
Benefits
Group Medical
Personal Accident
Term Life Insurance
Preventive healthcare, including dental, vision, and outpatient (OPD) needs