Machine Learning Engineer - Foundation Models for Biology

Prima Mente
San Francisco, CA
1d
Onsite

About The Position

Prima Mente is a frontier biology AI lab. We generate our own data, build general purpose biological foundation models, and translate discoveries into research and clinical outcomes. Our first goal is to tackle the brain: to deeply understand it, protect it from neurological disease, and enhance it in health. Our team of AI researchers, experimentalists, clinicians, and operators is based in London, San Francisco and Dubai.

Role focus - Foundation Models for Biology

You will play a pivotal role in designing, implementing, and scaling foundational AI models and infrastructure for multi-omics at massive scale. Your work will directly drive breakthroughs in scientific understanding and contribute to transformative applications in medicine and biology.

Requirements

  • Deep understanding of state-of-the-art machine learning methodologies and proven experience in translating them into practical solutions.
  • Solid foundation in distributed computing principles, parallel processing, and algorithmic efficiency.
  • Experience optimizing ML algorithms for performance, memory efficiency, and compute resource utilization.
  • Deep expertise in modern ML frameworks and tools (e.g., PyTorch, JAX, TensorFlow).
  • Familiarity with state-of-the-art techniques for training, optimizing, and deploying large-scale models (7B+ parameters), including inference workflows.
  • Skilled in designing and implementing scalable data pipelines capable of rapid ingestion, transformation, and processing.
  • Skilled in clearly articulating complex ideas, effectively communicating why particular approaches succeed or fail, and providing insightful critical analyses.
  • Low-level algorithm optimisation
  • Quantization (8-bit or lower)
  • JIT compilation
  • XLA/Mosaic/Triton/CUDA
  • Hardware optimisation (GPU/TPU/HPU)
  • Fine-tuning optimization (QLoRA, QDoRA)
  • Large-scale data (above 2T tokens)

Responsibilities

  • Implement high-performance ML algorithms optimised for massive scale, ensuring reliability, efficiency, and scalability.
  • Design, develop, and maintain robust experimentation pipelines enabling rapid iteration, precise evaluations, and reproducible research outcomes.
  • Refactor and scale prototype research code into clean, maintainable, and performant repositories suitable for production-grade deployments.
  • Create high-speed data processing workflows capable of efficiently handling large-scale datasets to accelerate experimentation and model development.
  • Design experiments, prioritising high-impact work with the highest signal-to-noise ratio.