Software Engineer building reliable distributed systems and services at Whatnot in a hybrid work environment. Focused on improving system reliability while collaborating with cross-functional teams.
Responsibilities
build distributed systems, services, and frameworks that improve the reliability of the entire platform
focus on making reliability a built-in property of our systems as scale, traffic, and complexity continue to grow
design, build, and operate reliability-focused components, services, and frameworks
shape the standards and practices that guide how software is built and run across Whatnot
partner closely with product, platform, and infrastructure teams to embed reliability concerns into system design, development workflows, and runtime behavior
design and operate traffic control mechanisms, including circuit breakers, rate limiting, backpressure, and graceful degradation
build and evolve load testing frameworks that validate system behavior under sustained, burst, and peak event traffic patterns
build chaos and resilience testing frameworks to proactively surface failure modes and validate recovery behavior
define and implement SLOs, SLIs, and error budgets that guide engineering teams toward the right reliability tradeoffs
develop reliability tooling and services that improve incident detection, response, and automated mitigation
review service architectures and designs with a focus on failure modes, scalability limits, and operational safety
participate in incident response and drive post-incident follow-ups that reduce repeated failure patterns through systemic fixes
Requirements
5+ years of experience as a software engineer working on large-scale distributed systems
Strong fundamentals in designing, building, and operating shared production services and frameworks
Experience with traffic control mechanisms such as circuit breakers and rate limiting
Experience building or operating load testing and chaos testing frameworks
Hands-on experience with observability, monitoring, and debugging production systems
Experience working with SLOs, error budgets, and incident response processes
Comfortable in cloud-native environments such as AWS or GCP with Kubernetes and infrastructure as code
Strong collaborator with clear written and verbal communication skills
Bonus: experience with high-traffic, real-time, or event-driven systems
Benefits
flexibility to work from home or from one of our global office hubs
in-person time for planning, problem-solving, and connection