Hybrid Senior Director of AI Infrastructure, Engineering

Posted 3 weeks ago

About the role

  • Working with AI research scientists, iterate on, optimize, deploy, and maintain innovative machine learning models, systems, and software tools that enable the analysis and interpretation of AI models for biology.
  • Work with cross-functional team members to iterate quickly on system performance and meet or stay ahead of users’ needs. For example, if feedback shows that a model doesn't scale to X million, work with our user researchers, scientists, and product team to iterate on the solution.
  • Partner with research scientists to build robust data loader pipelines for scalable distributed training and evaluation.
  • Serve as an interface to product and engineering teams to understand how models may need to evolve to support multiple use cases.
  • Develop model evaluation and interpretability frameworks that help biologists understand which data features drive model predictions.
  • Build reusable engineering utilities that can unlock experimentation velocity across research initiatives in the organization.
  • Optimize model architectures to enhance performance, fine-tune accuracy, and efficiently manage infrastructure resources.

Requirements

  • Experience working in a highly interactive, cross-functional, collaborative environment with a diverse team of colleagues and partners, solving complex problems through applied deep learning.
  • A track record of, and expertise in, developing deep learning models on large-scale GPU clusters using distributed training techniques such as DDP, FSDP, and model parallelism, along with low-precision training, profiling and optimizing AI/ML code, and fine-tuning models.
  • Expertise in leading end-to-end experimentation pipelines for training and evaluating deep learning models, with particular focus on experiment tracking and reproducibility.
  • A good working knowledge of Python-based ML libraries and frameworks such as PyTorch, JAX, TensorFlow, NumPy, Pandas, and Scikit-learn.
  • Experience using modern frameworks for distributed computing and infrastructure management, particularly those related to ML models, such as PyTorch Lightning, DeepSpeed, TransformerEngine, RayScale, etc.
  • Ability to effectively balance exploratory research with robust engineering practices.
  • A good working knowledge of general software engineering practices in a production environment.
  • The ability to work independently and as part of a team, with excellent communication and interpersonal skills.
  • A Master's degree in computer science with a focus on machine learning and data analytics, or equivalent industry experience, and at least 6-8 years of experience developing and applying machine learning methods.

Benefits

  • CZI provides a generous employer match on employee 401(k) contributions to support planning for the future.
  • An annual benefit that employees can use in the way most meaningful to them and their families, such as for housing, student loan repayment, childcare, commuter costs, or other life needs.
  • CZI Life of Service Gifts are awarded to employees to “live the mission” and support the causes closest to them.
  • Paid time off to volunteer at an organization of your choice.
  • Funding for select family-forming benefits.
  • Relocation support for employees who need assistance moving to the Bay Area.
  • And more!

Job title

Senior Director of AI Infrastructure, Engineering

Experience level

Senior

Salary

$241,000 - $331,000 per year

Degree requirement

Postgraduate Degree
