Senior Site Reliability Engineer at Instabase. Lead the SRE team that designs, operates, and improves SaaS cloud infrastructure, CI/CD, Kubernetes, and production reliability.
Responsibilities
Define and steer the technical direction for your team, collaborating with cross-functional partners
Develop and execute comprehensive short- and long-term roadmaps balancing business needs and user experience
Oversee cloud infrastructure and deployment automation to ensure efficient, reliable operations
Guarantee uptime and reliability for production systems through proactive monitoring and production support
Manage vulnerability assessments and facilitate prompt remediation to maintain security
Maintain and enhance CI/CD and build infrastructure to support seamless development workflows
Implement and optimize tools that enhance developer productivity and streamline processes
Drive improvements in release management processes and tooling to ensure smooth, reliable software delivery
Requirements
5+ years of experience in Site Reliability Engineering, Software Engineering, or Production Engineering
Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field, or equivalent practical experience
Demonstrated experience in managing and sustaining SaaS production environments
Hands-on experience with major cloud providers such as AWS and Azure
Proficient in containerization technologies like Docker
Expertise in container orchestration platforms, especially Kubernetes
Skilled in overseeing and managing software release processes
Systematic approach to solving platform and production issues and a passion for automation
Track record of setting technical and cultural standards for engineering teams