Principal Machine Learning Ops Engineer developing scalable ML infrastructure and analytics platforms for financial services. Collaborating with Data Scientists to drive model deployment and optimization.
Responsibilities
As a Principal Machine Learning Ops Engineer within the Enterprise Data Science Platform team, you will create frameworks to support large-scale ML infrastructure and pipelines, including tools for the containerization and deployment of ML models
Collaborating with Data Scientists, you will develop advanced analytics and machine learning platforms to enable the prediction and optimization of models
You will extend existing ML platforms for scaling model training and deployment, and partner with various business and engineering teams to drive the adoption and integration of model outputs
This role is essential in leveraging Data Science to deliver exceptional customer experiences in financial services
Requirements
Bachelor’s or master’s degree in a technology-related field (e.g., Engineering, Computer Science)
8+ years of proven experience in developing and implementing Python-based cloud applications and/or machine learning solutions
2+ years of experience in developing ML infrastructure and MLOps in the cloud using AWS SageMaker
5+ years of experience in building cloud-native applications using a range of AWS services, including but not limited to SageMaker AI, Bedrock, S3, CloudFormation (CFT), SNS, SQS, Lambda, AWS Batch, Step Functions, EventBridge, and CloudWatch
Familiarity with Azure Cognitive Services, particularly for deploying OpenAI models, and with Google Cloud Vertex AI is beneficial
Extensive experience deploying, running inference on, tuning, and measuring machine learning models is required
Experience in object-oriented programming (Java, Scala, Python), SQL, Unix scripting, or related programming languages, and exposure to Python’s ML ecosystem (NumPy, pandas, scikit-learn, TensorFlow, etc.)
Experience building data pipelines that gather the data required to build and evaluate ML models, using tools like Apache Spark or other distributed data processing frameworks
Experience with data movement technologies (ETL/ELT), messaging/streaming technologies (AWS SQS, Kinesis/Kafka), relational and NoSQL databases (DynamoDB, EKS, graph databases), and API and in-memory technologies
Strong knowledge of developing highly scalable distributed systems using open-source technologies
Strong experience with CI/CD tools, particularly Jenkins, for automating and streamlining the software development pipeline
Proficient in using version control systems like Git for effective code management and collaboration
Hands-on experience with containerization technologies such as Docker for building and deploying applications
Expertise in infrastructure as code (IaC) services, including AWS CloudFormation and tools like Terraform or OpenTofu, for managing and provisioning cloud resources
Solid experience with Agile methodologies (Kanban and Scrum)
Benefits
Comprehensive health care coverage and emotional well-being support
Market-leading retirement
Generous paid time off and parental leave
Charitable giving employee match program
Educational assistance including student loan repayment, tuition reimbursement, and learning resources to develop your career