About the role

  • A Data Scientist role developing data extraction and machine learning pipelines at S&P Global, building enterprise-scale data solutions and collaborating in a global team environment.

Responsibilities

  • Develop, deploy, and operate data extraction and automation pipelines in production
  • Integrate and deploy machine learning models into those pipelines (e.g., inference services, batch scoring)
  • Lead critical stages of the data engineering lifecycle, including:
      • End-to-end delivery of complex extraction, transformation, and ML deployment projects
      • Scaling and replicating pipelines on AWS (EKS, ECS, Lambda, S3, RDS)
      • Designing and managing DataOps processes, including Celery/Redis task queues and Airflow orchestration
      • Implementing robust CI/CD pipelines on Azure DevOps (build, test, deployment, rollback)
      • Writing and maintaining comprehensive unit, integration, and end-to-end tests (pytest, coverage)
  • Strengthen data quality, reliability, and observability through logging, metrics, and automated alerts
  • Define and evolve platform standards and best practices for code, testing, and deployment
  • Document architecture, processes, and runbooks to ensure reproducibility and smooth hand-offs
  • Partner closely with data scientists, ML engineers, and product teams to align on requirements, SLAs, and delivery timelines

Requirements

  • Expert proficiency in Python, including building extraction libraries and RESTful APIs
  • Hands-on experience with task queues and orchestration: Celery, Redis, Airflow
  • Strong AWS expertise: EKS/ECS, Lambda, S3, RDS/DynamoDB, IAM, CloudWatch
  • Containerization and orchestration: Docker (mandatory), basic Kubernetes (preferred)
  • Proven experience deploying ML models to production (e.g., SageMaker, ECS, Lambda endpoints)
  • Proficient in writing tests (unit, integration, load) and enforcing high coverage
  • Solid understanding of CI/CD practices and hands-on experience with Azure DevOps pipelines
  • Familiarity with SQL and NoSQL stores for extracted data (e.g., PostgreSQL, MongoDB)
  • Strong debugging, performance tuning, and automation skills
  • Openness to evaluate and adopt emerging tools and languages as needed
  • Bachelor's degree in Computer Science, Engineering, or related field (Good to have)
  • 2-4 years of relevant experience in data engineering, automation, or ML deployment (Good to have)
  • Prior contributions on GitHub, technical blogs, or open-source projects (Good to have)
  • Basic familiarity with GenAI model integration (calling LLM or embedding APIs) (Good to have)

Benefits

  • Health & Wellness: Health care coverage designed for the mind and body.
  • Flexible Downtime: Generous time off helps keep you energized for your time on.
  • Continuous Learning: Access a wealth of resources to grow your career and learn valuable new skills.
  • Invest in Your Future: Secure your financial future through competitive pay, retirement planning, a continuing education program with a company-matched student loan contribution, and financial wellness programs.
  • Family-Friendly Perks: It’s not just about you. S&P Global has perks for your partners and little ones, too, with some best-in-class benefits for families.
  • Beyond the Basics: From retail discounts to referral incentive awards—small perks can make a big difference.

Job title

Data Scientist

Job type

Experience level

Junior / Mid level

Salary

Not specified

Degree requirement

Bachelor's Degree

Location requirements
