Foundational AI Research Scientist developing next-generation language models. Pioneering large-language-model architectures and attention mechanisms for efficient scaling.
Responsibilities
Research and prototype sub-quadratic attention architectures to unlock efficient scaling of large language models.
Design and evaluate efficient attention mechanisms including state-space models (e.g., Mamba), linear attention variants, and sparse attention patterns.
Lead pre-training initiatives across a range of model scales from 1B to 100B+ parameters.
Conduct rigorous experiments measuring the efficiency, performance, and scaling characteristics of novel architectures.
Collaborate closely with product and engineering teams to integrate models into production systems.
Stay at the forefront of foundational research and help shape Aldea's long-term model roadmap.
Requirements
Ph.D. in Computer Science, Engineering, or a related field.
3+ years of relevant industry experience.
Deep understanding of modern sequence modeling architectures including State Space Models (SSMs), Sparse Attention mechanisms, Mixture of Experts (MoE), and Linear Attention variants.
Hands-on experience pre-training large language models across a range of scales (1B+ parameters).
Expertise in PyTorch, Transformers, and large-scale deep-learning frameworks.
Proven ability to design and evaluate complex research experiments.
Demonstrated research impact through patents, deployed systems, or core-model contributions.
Nice to Have
Experience with distributed training frameworks and multi-node optimization.
Knowledge of GPU acceleration, CUDA kernels, or Triton optimization.
Publication record in top-tier ML venues (NeurIPS, ICML, ICLR) focused on architecture research.
Experience with model scaling laws and efficiency-performance tradeoffs.
Background in hybrid architectures combining attention with alternative sequence modeling approaches.
Familiarity with training stability techniques for large-scale pre-training runs.
Benefits
Competitive base salary
Performance-based bonus aligned with research and model milestones