Infrastructure Management: Design, implement, and manage cloud-based infrastructure using AWS to support the company's applications and services.
Continuous Integration and Deployment (CI/CD): Establish and maintain robust CI/CD pipelines to automate the build, test, and deployment processes, enabling rapid and reliable software delivery.
Monitoring and Alerting: Develop and maintain monitoring solutions to ensure the availability, performance, and security of applications, services, and infrastructure. Set up proactive alerts to promptly identify and resolve potential issues.
Security and Compliance: Implement security best practices and maintain compliance with industry standards to safeguard sensitive data and protect against security threats.
Configuration Management: Employ configuration management tools to automate the setup and management of various environments, ensuring consistency across development, staging, and production.
Collaboration and Communication: Work closely with cross-functional teams, including developers, system administrators, and QA engineers, to foster a collaborative and productive DevOps culture.
Troubleshooting and Incident Response: Investigate and resolve production issues promptly, applying root cause analysis techniques to prevent recurrences and improve overall system stability.
Performance Optimization: Identify and address performance bottlenecks in applications and infrastructure to enhance system responsiveness and efficiency.
Requirements
Bachelor's degree in Computer Science, Software Engineering, or related field (or equivalent practical experience).
Proven experience as a DevOps Engineer or in a similar role.
Strong knowledge of cloud platforms (AWS) and containerization technologies (e.g., Kubernetes).
Experience with GitOps (e.g., Argo CD) for continuous delivery.
Proficiency in scripting and automation using tools like Bash, Python, or PowerShell.
Solid understanding of CI/CD concepts and experience with CI/CD tools (e.g., Tekton, Jenkins, GitLab CI, CircleCI).
Familiarity with monitoring and logging tools (e.g., Datadog preferred; Prometheus, Grafana, ELK stack).
Knowledge of version control systems (e.g., Git).
Strong problem-solving skills and ability to work effectively in a fast-paced, agile development environment.
Excellent communication and teamwork skills.
**Preferred Skills:**
Relevant certifications like AWS Solutions Architect.
Experience with infrastructure-as-code (IaC) tools such as Terraform or CloudFormation.
Familiarity with database administration and management.
DevOps Analyst providing high-quality, reliable solutions within multifunctional teams at a technology-focused financial organization. Automating build and deployment solutions in a hybrid work environment.
Network & Datacenter Deployment Engineer at Cloudflare focused on building and expanding their global network infrastructure with collaboration across multiple engineering teams and vendors.
Senior DevOps Engineer leading cloud-native solutions at Sparksoft Corporation. Driving automation and system reliability within a fast-paced Agile team.
Platform Engineer focusing on supporting CI/CD pipelines and Kubernetes at PCCW. Responsible for ensuring platform services' reliability and performance, with night-time support as needed.
Site Reliability Engineer at Bumble optimizing large-scale Linux environments and ensuring system stability. Focusing on troubleshooting, incident recovery, and performance tuning in complex infrastructures.
DevOps Manager overseeing an engineering team developing scalable CI/CD processes for NVIDIA Networking products. Enhancing global R&D efficiency in a technology-focused company.
Senior DevOps Manager overseeing CI/CD processes for NVIDIA Networking products. Leading a team and collaborating with global teams to enhance R&D efficiency and infrastructure.
Join the Operations Team as a Senior Site Reliability Engineer driving operational excellence for cybersecurity solutions. Collaborate across teams to manage production platforms and optimize infrastructure.