Foundational AI Research Scientist developing next-generation language models. Pioneering large-language-model architectures and attention mechanisms for efficient scaling.
Responsibilities
Research and prototype sub-quadratic attention architectures to unlock efficient scaling of large language models.
Design and evaluate efficient attention mechanisms including state-space models (e.g., Mamba), linear attention variants, and sparse attention patterns.
Lead pre-training initiatives across a range of model scales from 1B to 100B+ parameters.
Conduct rigorous experiments measuring the efficiency, performance, and scaling characteristics of novel architectures.
Collaborate closely with product and engineering teams to integrate models into production systems.
Stay at the forefront of foundational research and help shape Aldea's long-term model roadmap.
Requirements
Ph.D. in Computer Science, Engineering, or a related field.
3+ years of relevant industry experience.
Deep understanding of modern sequence-modeling architectures, including state-space models (SSMs), sparse attention mechanisms, mixture of experts (MoE), and linear attention variants.
Hands-on experience pre-training large language models across a range of scales (1B+ parameters).
Expertise in PyTorch, Transformers, and large-scale deep-learning frameworks.
Proven ability to design and evaluate complex research experiments.
Demonstrated research impact through patents, deployed systems, or core-model contributions.
Nice to Have
Experience with distributed training frameworks and multi-node optimization.
Knowledge of GPU acceleration, CUDA kernels, or Triton optimization.
Publication record in top-tier ML venues (NeurIPS, ICML, ICLR) focused on architecture research.
Experience with model scaling laws and efficiency-performance tradeoffs.
Background in hybrid architectures combining attention with alternative sequence modeling approaches.
Familiarity with training stability techniques for large-scale pre-training runs.
Benefits
Competitive base salary
Performance-based bonus aligned with research and model milestones