Senior Site Reliability Engineer at Broadridge managing infrastructure design and operational support. Collaborating with teams to improve automation, performance, and reliability of services in a hybrid environment.
Responsibilities
Work within and across teams to design, develop, test, implement, and support technical solutions across a full-stack of development tools and technologies.
Translate business requirements into technical designs, considering automation, availability, performance, scale and cost.
Ensure that technical and security best practices, along with Broadridge standards, are followed in the design of technical infrastructure.
Participate in technical design sessions and work closely with multiple teams, including application development teams, infrastructure teams, vendors, and, if needed, clients, to review the infrastructure designs for new projects.
Deliver high quality technical infrastructure, on-time, following Broadridge processes.
Automate the implementation and operational support of the infrastructure.
Provide estimates for all priority and non-priority projects, along with recommended scope or schedule changes based on capacity and unforeseen challenges.
Participate in technical implementation to ensure the quality of the infrastructure, automation and the overall productivity of the SRE (Site Reliability Engineering) team.
Track Service Level Indicators (SLI) to ensure the health of technical infrastructure and Broadridge services.
Troubleshoot production issues affecting Broadridge services as needed, taking appropriate corrective actions.
Conduct preventative maintenance to ensure capacity, scaling, security and availability of Broadridge services.
Understand dependencies between infrastructure components, vendor software, custom software, and other parts of the processing stack that support Broadridge services.
Collaborate with peers and other technical teams, such as development, architecture, database, storage, server, and security teams, to prevent and shorten production incidents.
Define Service Level Objectives (SLOs) for Broadridge services.
Implement additional operational improvements for automation, monitoring and incident management to increase the reliability of Broadridge services.
Guide more junior associates through established processes.
Requirements
Bachelor’s degree in Computer Science, Computer Engineering, or a related field.
8+ years of experience with commercial service infrastructure at both a software and infrastructure level.
Experience managing datacenter-hosted and AWS-hosted applications.
8+ years of experience within a programming and application system environment, with solid experience in and working knowledge of the following technologies: OS: Linux, Windows.
Skills: Functional skills – System Design and Architecture, DevOps / Deployment automation, Troubleshooting, Service Monitoring.
Passionate teammate who understands and respects personal and cultural differences.
Ability to work under pressure and be highly adaptable.
Strong writing and communication skills for collaboration with various teams and upper management.
Solid analytical skills, especially in the area of translating business requirements into technical design, with a continuous focus on aligning the technical roadmap with the immediate and long-term business strategy.
Ability to adapt and embrace change and support business strategy and vision.
Knowledge of next-generation design patterns/architecture like micro-services, layered pattern, cloud.
Strong aptitude for learning new skills and new technologies.
Benefits
Please visit www.broadridgebenefits.com for more information on our comprehensive benefit offerings.
Job title
Senior Site Reliability Engineer – Hybrid, Flexible Options