Senior Software Engineer responsible for production incident response and system reliability for a B2B WealthTech startup. Managing Go and Node.js services in a hybrid tech environment.
Responsibilities
Lead and execute production incident response: triage, mitigation, stakeholder communication, and coordination across teams
Debug and fix issues across Go services (mandatory) and the broader stack (Node.js services where relevant)
Work across service boundaries: GraphQL/RPC, distributed tracing, dependency failures, performance bottlenecks, and safe degradation patterns
Troubleshoot Kubernetes workloads and deployments
Diagnose PostgreSQL/CNPG issues
Handle production bugs that span the application and data pipelines (ETL/Snowflake mappings), including backfills/replays and data-quality validation
Apply reliability patterns common to trading/fintech platforms: a correctness and data-integrity mindset (idempotency, reconciliation), resilient partner integrations, and strong observability for critical user journeys
Leading DevOps platform strategy for KIPMI Software's next-generation digital trust products. Collaborating with teams to implement scalable infrastructure and DevSecOps practices.
Join our DevOps team to build and manage GitHub pipelines and cloud-native Azure solutions. Collaborate with teams to drive DevOps best practices and optimize deployments.
Site Reliability Engineer enhancing system reliability and deployment practices at OpenLoop. Collaborating with cross-functional teams for incident management and performance tuning.
Senior DevOps Engineer enhancing Azure application reliability for a healthcare fintech platform. Collaborating closely with engineering teams to ensure deploy safety and observability.
DevOps Engineer contributing to tooling changes and leading a community of practice at Totara. Focused on collaboration, development, and support for internal teams.
Site Reliability Engineer responsible for infrastructure supporting AI platform. Safeguarding US customer data and ensuring compliance in the Aerospace and Defense sector.
Senior Infrastructure Engineer managing Azure platform for a SaaS product at Rillion. Focused on automation, security, reliability, and scalability in a hybrid work environment.
Statistician/Reliability Engineer applying statistical analysis for satellite systems at Aerospace Corporation. Leading projects on system reliability and working closely with interdisciplinary teams in a full-time on-site role.
DevOps Engineer designing and implementing solutions to optimize operations in media technology at Mediagenix. Collaborating with cross-functional teams to enhance user experiences.
DevOps Senior Software Engineer at SimCorp developing high-quality software solutions for financial technology. Responsible for mentoring junior engineers and solving complex technical challenges.