Big Data Engineer optimizing scalable data solutions using Hadoop, PySpark, and Hive at Citi. Responsible for building ETL pipelines and ensuring data quality in a hybrid work environment.
Responsibilities
Design, develop, and maintain efficient and scalable Big Data solutions using PySpark, Apache Hive, and Hadoop ecosystem tools
Implement and optimize ETL processes and data warehousing solutions
Conduct in-depth data analysis and troubleshoot complex data issues
Optimize Big Data workflows, including Spark job tuning and Hive query optimization
Perform rigorous unit testing and validation of data pipelines
Collaborate with data scientists, analysts, and other engineers
Requirements
Extensive experience designing, developing, and optimizing scalable data solutions using the Hadoop ecosystem, with a strong focus on PySpark and Hive
Strong Python knowledge
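Experience implementing and optimizing ETL (Extract, Transform, Load) processes and data warehousing solutions
Ability to conduct in-depth data analysis and troubleshoot complex data issues
Experience optimizing Big Data workflows, including Spark job tuning and Hive query optimization
Commitment to rigorous unit testing and validation of data pipelines
Ability to collaborate with data scientists, analysts, and other engineers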