Site Reliability Engineer automating infrastructure and operations at DTEX Systems. Seeking candidates with a strong software engineering background and experience in cloud environments.
Responsibilities
Design, write, and maintain software, primarily in Python, to automate the provisioning, deployment, and configuration management of our infrastructure
Contribute to the adoption and maturation of Terraform, establishing and maintaining best practices for state management, modularization, and version control
Use Ansible and/or SaltStack to ensure consistency, repeatability, and standardization across all environments
Develop robust CI/CD pipelines for both infrastructure and application deployments, replacing manual processes
Implement and mature monitoring, logging, and alerting systems to proactively improve system reliability
Participate in a “follow the sun” on-call rotation, focusing on sustainable incident response, blameless postmortems, and driving continuous improvement
Champion SRE principles, automation, and coding best practices within the team and across the organization
Requirements
3+ years of hands-on experience managing production environments in AWS and/or GCP.
Strong proficiency in Python.
Demonstrated ability to write clean, maintainable, and testable code to solve infrastructure problems.
Experience with Terraform, including best practices for state management and modular design in complex environments.
Strong knowledge of Linux internals and high competency in Bash scripting and command-line operations.
Proficiency with Ansible and/or SaltStack as configuration management tools.
Expert-level understanding of Git and collaborative workflows, such as branching strategies and code review best practices.
MS/BS in Computer Science/Computer Engineering or related field of study (or equivalent experience).
IT Infrastructure Specialist managing physical and virtual server environments for Premier League Studios. Ensuring robust workflows and high-performance infrastructure in a hybrid work setting.
Manager of Platform Engineering at a leading insurance company shaping the future of API platforms. Fostering innovation and collaboration while driving platform stability and resiliency.
Infrastructure Engineer responsible for building, monitoring, and securing IT infrastructure for NLACRC. Collaborates with IT personnel and external support to ensure robust infrastructure.
Infrastructure Engineering Intern working on cloud solutions at a global growth engine for commerce. Collaborating on secure, scalable systems and contributing to performance optimization.
Infrastructure Engineer supporting IT service management and implementing complex system solutions. Collaborating with business units and training junior team members in a hybrid environment.
Infrastructure Engineering Lead overseeing edge security initiatives for Lloyds Banking Group. Driving the development of security capabilities and mentoring engineering teams.
Lead Infrastructure Engineer focusing on web access protection and security strategies at Lloyds Banking Group. Managing infrastructure improvements and team leadership in enterprise environments.
Senior Infrastructure Engineer maintaining IT infrastructure and datacentre operations for Walkers Global. Installing, configuring, and troubleshooting various hardware and cloud services in a hands-on role.
Infrastructure Architect responsible for designing and implementing multi-cloud infrastructures. Collaborating with teams to ensure high availability, security, and cost efficiency in cloud environments.
Senior Database Administrator specializing in private cloud technologies for a fintech company's modernization agenda. Focused on database platform engineering with MS SQL and PostgreSQL.