Senior DevOps Engineer leading cloud-native solutions at Sparksoft Corporation. Driving automation and system reliability within a fast-paced Agile team.
Responsibilities
Lead DevOps initiatives and drive innovative automation and cloud-native solutions to enhance system efficiency, reliability, and scalability.
Collaborate daily within a fast-paced Agile DevOps team and actively participate in all phases of the Agile methodology.
Implement Infrastructure as Code (IaC) using Terraform/Ansible across AWS services.
Foster an automation-driven culture to reduce manual tasks and increase operational efficiency.
Design, develop, and optimize CI/CD pipelines using Jenkins, GitHub, and container-based platforms.
Use F5 LTM/APM to configure traffic routing, access control, and load balancing rules.
Write and troubleshoot F5 iRules for traffic management and integration use cases.
Ensure adherence to all CMS security, compliance, and governance requirements.
Support application operations, participate in the on-call rotation, and ensure high availability and system reliability.
Execute patches, upgrades, deployments, and continuity-of-operations activities across environments.
Maintain comprehensive documentation, including architectural diagrams, operational runbooks, and technical procedures.
Engage with architects and product teams to understand functional challenges, prototype new ideas and technologies, and help create innovative solutions.
Participate in defining project timelines and support the implementation of design specifications, system flow diagrams, documentation, testing, and ongoing application support.
Translate business and functional requirements into technical specifications and implement solutions that align with architectural and program objectives.
Mentor junior engineers, provide technical leadership, and champion best practices in DevSecOps, automation, performance engineering, and operational excellence.
Requirements
Strong expertise in AWS cloud infrastructure including EC2, ECS/EKS, Lambda, DynamoDB, SQS, SNS, S3, SES, Route53, EFS, VPC, ELB/NLB, AWS Serverless, AWS KMS, and AWS automation/configuration best practices.
Hands-on experience with F5 BIG-IP (LTM/APM) including configuring load balancing profiles, managing access policies, and writing/debugging iRules.
Proficiency with Infrastructure as Code (IaC) using Terraform and Ansible, including automation and environment provisioning.
Extensive experience with containerization and orchestration, including Docker, Kubernetes, and CI/CD pipelines using Jenkins, GitHub, and Bitbucket.
Strong Linux/RedHat administration skills with the ability to troubleshoot system-level and application-level issues.
Experience with microservices, API gateways and authentication/authorization standards (Okta, OIDC, SAML).
Ability to debug applications built in Java, Node.js, and Python from an infrastructure/DevOps perspective.
Experience working with relational and NoSQL databases, including PostgreSQL, RDBMS platforms, and DynamoDB.
Proficiency with observability and monitoring tools such as Splunk, New Relic, Datadog, AWS DevOps Guru, and AWS Forecast.
Ability to monitor, manage, and optimize system resources and performance.
Strong problem-solving, analytical, and communication skills with the ability to work independently and collaboratively in large enterprise environments.
Experience working in Agile SAFe environments and using collaboration tools such as Jira and Slack.
Self-directed, motivated, and capable of leading or supporting complex technical initiatives.
Candidates must be able to obtain and maintain a Public Trust clearance.
Candidates must have lived in the United States 3 out of the past 5 years.
Benefits
Competitive compensation and a 401(k) with employer contributions to help you plan for the future
Flexible paid time off and hybrid ways of working that support true work-life balance
Comprehensive health coverage, including medical, dental, vision, life, and disability insurance
A curated in-office experience designed to foster community, team connections, and innovation
Opportunities to give back through Sparksoft Cares, including annual company-wide fundraising events
Training and development programs that build new skills and prepare you for leadership roles
A collaborative, transparent, and fun culture, recognized as a Great Place to Work®
Senior DevOps Engineer responsible for cloud infrastructure and deployments. Optimizing AWS services and ensuring system security and reliability for Verizon.
Senior DevOps Engineer responsible for automating infrastructure and building CI/CD pipelines for a collaborative robotics company. Collaborating with global engineering teams from the Bangalore office.
Site Reliability Engineer Intern at Tencent working on gaming services and cloud native solutions. Collaborating with global teams to eliminate toil and enhance reliability.
Cloud/DevOps Specialist at N5X managing and optimizing critical cloud infrastructures for Brazilian energy trading. Collaborating with a multidisciplinary team to ensure high availability and performance.
Cloud/DevOps Specialist responsible for designing a hybrid architecture combining cloud and on-premises infrastructure for energy trading systems. Collaborating with a multidisciplinary team in a dynamic environment.
Reliability Engineering Specialist utilizing reliability tools and models to improve asset performance at Enbridge. Collaborating across teams to guide investment decisions for safe operations.
DevOps Engineer responsible for structuring and supporting cloud DevOps architecture in Brazil. Working strategically on automation and CI/CD practices with development teams in Pernambuco.
DevSecOps Software Engineer developing secure CI/CD pipelines for Boeing's military software systems. Collaborating with cross-functional teams and implementing automation and security best practices.
DevOps Manager responsible for managing a team delivering multi-cloud solutions in support of the USAF Cloud One project. Focusing on scalable cloud-native solutions and CI/CD practices.
Lead Site Reliability Engineer overseeing SRE practices across Azure and GCP platforms. Driving reliability improvements and leading a team at Lloyds Banking Group.