Research Engineer focusing on the decentralized AI training stack at Prime Intellect. Engaging in novel research, optimizing workloads, and contributing to open-source frameworks.
Responsibilities
Lead and participate in novel research to build a massive-scale, highly reliable, and secure decentralized training orchestration solution.
Optimize the performance, cost, and resource utilization of AI workloads by leveraging the most recent advances in compute and memory optimization techniques.
Contribute to the development of our open-source libraries and frameworks for distributed model training.
Publish research in top-tier AI conferences such as ICML & NeurIPS.
Distill highly technical project outcomes into approachable technical blog posts for our customers and developers.
Stay up to date with the latest advancements in AI/ML infrastructure, tools, and decentralized training research, and proactively identify opportunities to enhance our platform's capabilities and user experience.
Requirements
Strong background in AI/ML engineering, with extensive experience in designing and implementing end-to-end pipelines for training and deploying large-scale AI models.
Deep expertise in distributed training techniques, frameworks (e.g., PyTorch Distributed, DeepSpeed, MosaicML’s LLM Foundry), and tools (e.g., Ray) for optimizing the performance and scalability of AI workloads.
Experience in large-scale model training, including distributed training techniques such as data, tensor, and pipeline parallelism.
Solid understanding of MLOps best practices, including model versioning, experiment tracking, and continuous integration/deployment (CI/CD) pipelines.
Passion for advancing the state-of-the-art in decentralized AI model training and democratizing access to AI capabilities for researchers, developers, and businesses worldwide.
If you're not familiar with these but feel that you can contribute to our mission and you're a high-energy person, get familiar with these resources (here, here and here) and please reach out!
Benefits
Competitive compensation, including equity incentives, aligning your success with the growth and impact of Prime Intellect.
Flexible work arrangements, with the option to work remotely or in-person at our offices in San Francisco.
Visa sponsorship and relocation assistance for international candidates.
Quarterly team off-sites, hackathons, conferences and learning opportunities.
Opportunity to work with a talented, hard-working and mission-driven team, united by a shared passion for leveraging technology to accelerate science and AI.
Research Engineer developing agentic systems at Anthropic focused on LLMs and AI applications. Collaborating with researchers to enhance agent performance and tackle complex tasks.
System Modelling Innovation Engineer at Electrolux developing advanced product development system models. Enhancing modeling techniques and optimizing product development for better consumer experiences.
R&D Engineer developing estimation and control strategies for Electrolux appliances. Collaborating with global teams to innovate product features and drive sustainability in consumer electronics.
Principal Research Engineer leading engineering activities in behavior autonomy for Scientific Systems. Overseeing critical technology deliverables, team management, and proposal efforts.
Staff Research Engineer involved in creating a neurosymbolic AI agent at Onton. Focused on optimal decision-making processes and addressing challenges in current AI systems.
Post-Training Research Engineer at Baseten developing tooling for efficient AI model training. Collaborating on diverse architectures and systems-level concepts to enhance performance in AI applications.
AI Data Innovation Engineer developing and validating AI capabilities tied to governed enterprise data products at U.S. Bank. Collaborating on AI readiness efforts and supporting data product initiatives.
Research Engineer at Yooz, specializing in AI-driven document automation. Collaborating with R&D to develop innovative technologies and enhance document management solutions.
System Test & Research Engineer developing testing protocols and supporting improvements in Precision Agriculture solutions at Topcon. Collaborating with teams to ensure product quality and performance.
Senior Research Engineer developing mechanical designs for engine demonstrators at GKN Aerospace. Leading technology integration and collaborating across engineering disciplines in aeronautics.