Principal Engineer shaping the reliability strategy for Saviynt’s critical Federal SaaS platform. The role involves infrastructure design and management at large scale.
Responsibilities
Define and drive the reliability strategy for our SaaS platform.
Design, build, and maintain the shared infrastructure services and platforms that our product and application teams will depend on.
Hold teams accountable for meeting customer-facing Service Level Agreements (SLAs).
Design Continuous Delivery (CD) processes for government deployments that will eventually be used commercially.
Develop robust internal-facing tools and automation for infrastructure provisioning and management primarily using Go (Golang) or Python.
Architect and optimize foundational solutions within Cloud environments (AWS, Azure, etc.).
Design and implement shared Event-Driven Architecture components and messaging platforms using technologies like Kafka or Google Pub/Sub.
Design and build resilient Distributed Systems components that serve as building blocks for other applications.
Manage and optimize our shared infrastructure across Multi-Region Cloud Environments.
Establish and enhance centralized Observability and Monitoring platforms and tools.
Define and implement clear, well-documented RESTful API designs for the infrastructure services you build.
Implement and manage Service Mesh (e.g., Envoy, Istio) capabilities.
Design, implement, and optimize highly available Relational Database services or shared data platforms.
Collaborate closely with product development teams to understand their infrastructure needs and pain points.
Participate in on-call rotations to support the critical shared infrastructure you build.
Requirements
9+ years of experience in an Infrastructure Development, Platform Engineering, or Site Reliability Engineering role, with a strong focus on building tools and services for other engineers
Deep expertise with Kubernetes in production environments, particularly in providing it as a platform (i.e., single-tenant and multi-tenant deployment architectures)
Strong programming skills in Go (Golang) and Python, with experience building robust, maintainable backend services and automation
Extensive hands-on experience with at least one major Cloud Provider (AWS, GCP, or Azure); multi-cloud experience is a strong plus, especially in building abstractions over them
Proven experience designing and implementing Event-Driven Architecture and message queuing systems (e.g., Kafka, RabbitMQ, NATS) as shared services
Solid understanding and practical experience with CI/CD pipeline tools (especially GitLab CI) and experience establishing automated delivery processes for other teams
Demonstrable experience designing and operating Distributed Systems, with an understanding of patterns for creating reliable, shared components
Familiarity with Multi-Region Cloud Environments and strategies for building globally distributed and highly available platforms
Proficiency in establishing and utilizing comprehensive Observability and Monitoring platforms (e.g., Prometheus, Grafana, ELK stack, Datadog) for shared infrastructure
Strong experience with RESTful API design principles and building well-documented, consumable APIs
Knowledge of Service Mesh concepts and practical experience with solutions like Istio in a platform context
Hands-on experience with Relational Databases (e.g., MySQL, PostgreSQL), ideally in managing them as a service
Excellent communication skills and the ability to clearly articulate complex technical concepts to both technical and non-technical audiences
A strong customer-centric mindset, treating internal development teams as your primary customers
Bachelor's degree in Computer Science, Engineering, or a related field, or equivalent practical or military experience, required
Nice-to-Have Qualifications
Experience with FedRAMP compliance and government security requirements
Track record of implementing secure CI/CD pipelines in restricted or regulated environments
Benefits
Competitive compensation, benefits, and growth opportunities