Site Reliability Engineer at Equifax ensuring reliability and performance of distributed fault-tolerant systems. Collaborating with teams to build cost-effective systems with high uptime metrics.
Responsibilities
Work in a DevSecOps environment responsible for the building and running of large-scale, massively distributed, fault-tolerant systems.
Work closely with development and operations teams to build highly available, cost-effective systems with extremely high uptime metrics.
Work with the cloud operations team to resolve trouble tickets, develop and run scripts, and troubleshoot issues.
Create new tools and scripts for auto-remediation of incidents, and establish end-to-end monitoring and alerting on all critical aspects.
Build infrastructure as code (IaC) patterns that meet security and engineering standards using one or more technologies (Terraform, scripting with cloud CLI, and programming with cloud SDK).
Participate in a team of first responders in a 24/7, follow-the-sun operating model for incident and problem management.
Requirements
BS degree in Computer Science or related technical field involving coding (e.g., physics or mathematics), or equivalent job experience required
2-5 years of experience in software engineering, systems administration, database administration, and networking.
1+ years of experience developing and/or administering software in public cloud
Experience in monitoring infrastructure and application uptime and availability to ensure functional and performance objectives are met.
Experience in languages such as Python, Bash, Java, Go, JavaScript, and/or Node.js
Demonstrable cross-functional knowledge with systems, storage, networking, security and databases
System administration skills, including automation and orchestration of Linux/Windows using Terraform, Chef, Ansible and/or containers (Docker, Kubernetes, etc.)
Proficiency with continuous integration and continuous delivery tooling and practices
Cloud certification strongly preferred
Benefits
Hybrid work setting
Comprehensive compensation and healthcare packages
Attractive paid time off
Organizational growth potential through online learning platform with guided career tracks
DevSecOps Engineer responsible for enhancing Thales' secure hosting platforms in public and private clouds. Collaborating with teams to apply modern practices and build resilient infrastructures.
Develops high-automation services in Golang or Java within AWS, Kubernetes, and Azure. Supports teams in building secure applications while working in a hybrid environment.
DevOps Engineer specializing in AWS Cloud Infrastructure in a hybrid position. Collaborating within a supportive team to build modern infrastructure for VM-based applications.
Leading DevOps platform strategy for KIPMI Software's next-generation digital trust products. Collaborating with teams to implement scalable infrastructure and DevSecOps practices.
Join our DevOps team to build and manage GitHub pipelines and cloud-native Azure solutions. Collaborate with teams to drive DevOps best practices and optimize deployments.
Site Reliability Engineer enhancing system reliability and deployment practices at OpenLoop. Collaborating with cross-functional teams for incident management and performance tuning.
Senior DevOps Engineer enhancing Azure application reliability for a healthcare fintech platform. Collaborating closely with engineering teams to ensure deploy safety and observability.
DevOps Engineer contributing to tooling changes and leading a community of practice at Totara. Focused on collaboration, development, and support for internal teams.
Site Reliability Engineer responsible for infrastructure supporting AI platform. Safeguarding US customer data and ensuring compliance in the Aerospace and Defense sector.