Senior DevOps managing data platform workloads on AWS. Collaborating on Data Mesh architecture and optimizing data pipelines in a hybrid work environment.
Responsibilities
Manage, capacity-plan, and operate workloads running on EC2 clusters via Databricks/EMR to ensure efficient and reliable data processing
Collaborate with stakeholders to design and implement a Data Mesh architecture across multiple closely related but separate enterprise entities
Utilize Infrastructure as Code (IaC) tools such as CloudFormation or Terraform to define and manage data platform user access to data and compute resources
Implement role-based access control (RBAC) mechanisms using IaC templates to enforce least privilege principles and ensure secure access to data and compute resources
Collaborate with cross-functional teams to design, implement, and optimize data pipelines and workflows
Utilize distributed engines such as Spark to process and analyze large volumes of data efficiently when required
Develop and maintain operational best practices for Spark and other data warehousing tools to ensure system stability and performance
Implement and manage storage technologies to efficiently store and retrieve data as per business requirements
Troubleshoot and resolve platform-related issues in a timely manner to minimize downtime and disruptions
Stay updated on emerging technologies and industry trends to continuously enhance the data platform infrastructure
Document processes, configurations, and changes to ensure comprehensive system documentation
Requirements
Knowledge of one or more of the following: **AWS CloudFormation, Terraform** for infrastructure provisioning
Knowledge of source control and its related concepts (**GitLab/Git flow, trunk-based development, branching,** etc.)
Familiarity with at least one programming language (**Python, Bash,** etc.)
Familiarity with a distributed compute engine such as **Spark**
Familiarity with a data platform or data orchestration tool such as **Databricks/Airflow**
In-depth working knowledge of and experience with **AWS IAM, VPC, EC2, RDS, DynamoDB, DMS,** and **S3**
Experience with CI/CD tools (such as **Jenkins, TeamCity, AWS CodePipeline, CodeDeploy**) or configuration management tools (such as Ansible, Chef, Puppet, etc.)
DevOps mindset with automation and operational excellence in mind
Good English skills and the ability to communicate effectively with business and technical teams
Demonstrate good logical thinking and problem-solving skills
**Be curious and have a self-learning attitude**
Big Plus:
AWS Data Engineer Associate or DevOps Professional Certifications
You are:
Passionate about technology
Independent but also a team player
Comfortable with a high degree of ambiguity
Focused on usability and speed
Keen on presenting your ideas to your peers and management.
Benefits
Meal and parking allowances are covered by the company.
Full benefits and salary rank during probation.
Insurance as required by Vietnamese labor law, plus premium health care for you and your family.
SMART goals and clear career opportunities (technical seminar, conference, and career talk) - we focus on your development.
Values-driven, international working environment, and agile culture.
Overseas travel opportunities for training and work-related purposes.
Internal Hackathons and company events (team building, coffee run, etc.).
Pro-rata and performance bonuses.
15 days of annual leave plus 3 days of sick leave per year from the company.