AI Data Pipeline Engineer designing and operating high-throughput systems for petabyte-scale data delivery, collaborating across teams to ensure data flows efficiently into AI workloads.
Responsibilities
Design and build high-performance, scalable data pipelines to support diverse AI and Machine Learning initiatives across the organization.
Architect and implement multi-region data infrastructure to ensure global data availability and seamless synchronization.
Develop flexible pipeline architectures that allow for complex branching and logic isolation to support multiple concurrent AI projects.
Optimize large-scale data processing workloads using Databricks and Spark to maximize throughput and minimize processing costs.
Maintain and evolve the containerized data environment on Kubernetes, ensuring robust and reliable execution of data workloads.
Collaborate with AI researchers and platform teams to streamline the flow of high-quality data into training and evaluation pipelines.
Requirements
Extensive professional experience in building and operating production-grade data pipelines for massive-scale AI/ML datasets.
Strong proficiency in distributed processing frameworks, particularly Apache Spark and the Databricks ecosystem.
Deep hands-on experience with workflow orchestration tools like Apache Airflow for managing complex dependency graphs.
Solid understanding of Kubernetes and containerization for deploying and scaling data processing components.
Proficiency in distributed messaging systems such as Apache Kafka for high-throughput data ingestion and event-driven architectures.
Expert-level programming skills in Python for system-level optimizations.
Strong knowledge of cloud-native services and best practices for building secure and scalable data infrastructure.
Logical approach to problem-solving with the persistence to identify and resolve root causes in complex, large-scale systems.
Strong communication skills to effectively collaborate with cross-functional teams and external partners.
Benefits
When submitting your résumé, please exclude information that is prohibited from being requested under the Fair Hiring Procedure Act, such as resident registration number, family relations, marital status, salary, photo, physical characteristics, and region of origin.
Please upload all files as PDFs no larger than 30MB. (If you encounter any issues while uploading your résumé, please send it to [email protected] along with the URL of the position you are applying for.)