About the role

  • A Data Scientist role developing data extraction and machine learning pipelines at S&P Global, building enterprise-scale data solutions and collaborating in a global team environment.

Responsibilities

  • Develop, deploy, and operate data extraction and automation pipelines in production
  • Integrate and deploy machine learning models into those pipelines (e.g., inference services, batch scoring)
  • Lead critical stages of the data engineering lifecycle, including:
      • End-to-end delivery of complex extraction, transformation, and ML deployment projects
      • Scaling and replicating pipelines on AWS (EKS, ECS, Lambda, S3, RDS)
      • Designing and managing DataOps processes, including Celery/Redis task queues and Airflow orchestration
      • Implementing robust CI/CD pipelines on Azure DevOps (build, test, deployment, rollback)
      • Writing and maintaining comprehensive unit, integration, and end-to-end tests (pytest, coverage)
  • Strengthen data quality, reliability, and observability through logging, metrics, and automated alerts
  • Define and evolve platform standards and best practices for code, testing, and deployment
  • Document architecture, processes, and runbooks to ensure reproducibility and smooth hand-offs
  • Partner closely with data scientists, ML engineers, and product teams to align on requirements, SLAs, and delivery timelines

Requirements

  • Expert proficiency in Python, including building extraction libraries and RESTful APIs
  • Hands-on experience with task queues and orchestration: Celery, Redis, Airflow
  • Strong AWS expertise: EKS/ECS, Lambda, S3, RDS/DynamoDB, IAM, CloudWatch
  • Containerization and orchestration: Docker (mandatory), basic Kubernetes (preferred)
  • Proven experience deploying ML models to production (e.g., SageMaker, ECS, Lambda endpoints)
  • Proficient in writing tests (unit, integration, load) and enforcing high coverage
  • Solid understanding of CI/CD practices and hands-on experience with Azure DevOps pipelines
  • Familiarity with SQL and NoSQL stores for extracted data (e.g., PostgreSQL, MongoDB)
  • Strong debugging, performance tuning, and automation skills
  • Openness to evaluate and adopt emerging tools and languages as needed
  • Bachelor's degree in Computer Science, Engineering, or related field (Good to have)
  • 2-4 years of relevant experience in data engineering, automation, or ML deployment (Good to have)
  • Prior contributions on GitHub, technical blogs, or open-source projects (Good to have)
  • Basic familiarity with GenAI model integration (calling LLM or embedding APIs) (Good to have)

Benefits

  • Health & Wellness: Health care coverage designed for the mind and body.
  • Flexible Downtime: Generous time off helps keep you energized for your time on.
  • Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
  • Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
  • Family-Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
  • Beyond the Basics: From retail discounts to referral incentive awards—small perks can make a big difference.

Job title

Data Scientist

Job type

Experience level

Junior / Mid level

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
