Machine Learning Scientist, Oligo Research Intern

Sarepta Therapeutics
2d · $21 - $26 · Hybrid

About The Position

Why Sarepta? Why Now?

The promise of genetic medicine has arrived, and Sarepta is at the forefront. We hold a leadership position in Duchenne muscular dystrophy (Duchenne) and are building a robust portfolio of programs across muscle, central nervous system, and cardiac diseases. In 2023, we launched our fourth therapy and the first ever gene therapy to treat Duchenne. We’re looking for people who see unlimited potential in themselves and who are motivated by an unwavering commitment to patients.

What Sarepta Offers

At Sarepta, we care deeply about all the people in our community and believe in the importance of supporting them in all aspects of their lives. We aspire to maintain a culture that acknowledges people bring their whole selves to work, and we will strive to help everyone in our community integrate their work and personal lives while maintaining productivity.

The Importance of the Role

Sarepta is expanding its capabilities in antisense oligonucleotide (ASO) therapeutics. As part of this initiative, we are integrating advanced machine learning (ML), artificial intelligence (AI), and data-driven modeling to accelerate ASO discovery and optimization. The Machine Learning Scientist, Oligo Research Intern will assist in developing a reproducible computational framework that enhances Sarepta’s AI/ML-assisted design pipeline for next-generation ASO therapeutics. This role offers deep exposure to the drug development lifecycle, encompassing everything from data curation to model validation and tool deployment. The intern will gain hands-on experience in shaping the computational strategy behind future ASO designs.

The Opportunity to Make a Difference

In this internship, you will focus on developing sequence-aware predictive models to prioritize PMOs (phosphorodiamidate morpholino oligomers) based on their expected exon-skipping response. Additionally, the modeling framework will be extended to include PMO gapmers, siRNA–PMO hybrids, and ranking strategies for conjugate designs, significantly broadening the cross-modality design capabilities within Sarepta.

Requirements

  • Current undergraduate or master's student (preferred) in Computational Chemistry/Biology, Machine Learning, Biomedical/Chemical Engineering, or a related field.
  • A background in oligonucleotide design and characterization is strongly preferred.
  • Experience in developing and/or adopting probabilistic learning or deep learning models, including Recurrent Neural Networks (RNNs), Graph Neural Networks (GNNs), Transformers, Natural Language Processing models, and Generative AI.
  • Programming and scripting skills in languages such as Python, R, and SQL, with hands-on experience using modern machine learning and deep learning frameworks such as PyTorch, TensorFlow, scikit-learn, or JAX.
  • Familiarity with large-scale computing and cloud infrastructures, database systems, and development tools in a production environment.
  • Familiarity with data development tools and infrastructure: AWS, database technologies, GitHub, GitLab, and Docker containers.
  • Ability to communicate and collaborate effectively with a multidisciplinary team, including chemists, biologists, and data scientists, to successfully complete scientific projects.
  • Strong team player with a commitment to continuous learning.

Nice To Haves

  • Experience in developing machine learning models for DNA, RNA, and proteins, including language models, structure prediction, and design, is a plus.

Responsibilities

  • Develop and implement sequence‑aware machine learning models to prioritize ASO designs by predicted exon‑skipping response across multiple targets.
  • Build a reproducible computational framework including data ingestion, feature engineering, model training, validation, and deployment for oligonucleotide design.
  • Extend the modeling and evaluation framework to PMO gapmers, siRNA–PMO hybrids, and conjugate designs to broaden cross‑modality capabilities.
  • Curate and harmonize internal datasets and external literature‑curated datasets, and define robust sequence and structure features such as thermodynamics, accessibility, sequence motifs, secondary structure, and additional context that drive model performance.
  • Establish benchmarks and prospective tests to assess accuracy, robustness, and scalability, and partner with experimental teams to validate predictions.
  • Evaluate and adopt proprietary and open source tools to enhance modeling workflows and accelerate decision support.
  • Maintain a clean, well‑documented codebase and provide user guidance for cross‑functional teams.
  • Perform additional related tasks as assigned.