Senior Machine Learning Engineer – Model Training and Customization

Red River • Boston, MA

About The Position

Come be a part of Red Hat's charge to democratize AI with open source! Red Hat's Global Engineering Team is looking for a Senior Machine Learning Engineer to join our newly formed AI Engineering organization. This role sits within the AI Innovation team, which conducts customer- and science-driven research to drive innovation for Red Hat's customers. The team operates on a pattern of "research → open-source software → product".

In this role, you will build the core logic and enhancements for our model fine-tuning and post-training libraries. You will work directly with research scientists and open source AI communities to build and improve implementations of novel training methods, ranging from SFT, continual learning, and offline preference tuning to online reinforcement learning methods like GRPO and RLHF. You will develop working relationships across multiple teams, contributing to both upstream open source projects and our internal Training Hub.

The ideal candidate is a highly collaborative individual with a passion for working on complex ML projects in an open organization where contributions are valued and expected from all levels. Because this is a fast-moving area of opportunity for Red Hat, the ability to communicate productively and effectively with team members, stakeholders, and Red Hat leadership is critical. Success in this role means delivering robust, scalable training libraries that bridge cutting-edge research with production needs.

This position reports directly to the Manager of AI Innovation. This position may require occasional travel, multiple times per quarter, to collaborate in person at our Boston, MA office. Successful applicants must reside in a state where Red Hat is registered to do business.

Requirements

Bachelor's degree in computer science or equivalent.
3+ years of experience in Python development.
Significant background in AI/ML projects or coursework (neural networks, deep learning, language models, reinforcement learning).
Experience in research engineering, machine learning engineering, or applied ML roles.
Strong experience with common model architecture development and adapter frameworks (e.g. PyTorch, Transformers, PEFT).
Familiarity with distributed training frameworks (e.g. FSDP, DeepSpeed) and inference runtimes (e.g. vLLM).
Experience in open-source projects and collaborative development workflows.
Existing background in software development or engineering, building robust and consumable libraries and implementations.
Experience with unit testing, integration testing, and performance testing.
Strong self-motivation and organizational skills.
Excellent written and verbal communication skills.
Positive attitude and willingness to share ideas openly.

Nice To Haves

Master's or PhD in Machine Learning (ML) / Natural Language Processing (NLP).
Experience with MLOps and deployment systems (e.g., Kubeflow, MLflow, Kubernetes, CI/CD pipelines).
Experience writing functional, end-to-end, or coverage tests in Python.
Experience with GitHub Actions, GitHub automation, or CI/CD practices.
Experience reading, writing, publishing, or implementing research papers.
Experience with Red Hat products.
Experience with large language models.

Responsibilities

Develop core libraries for various model post-training methods and innovations.
Work directly on upstream, open source projects and engage with community needs and contributions.
Contribute to core post-training algorithm research and engineering, introducing new methods both to community efforts and our own Training Hub.
Understand and adapt novel architectures and techniques to work with various post-training algorithms, across distributed training frameworks.
Optimize, enhance, and improve robustness and usability of both existing and in-flight projects, working closely with researchers to validate prototype logic.
Maintain and expand the libraries' feature set, and address core algorithm bugs and blockers.
Work closely with software engineers on interface and testing designs.
Participate in code reviews and collaborate on best practices within the engineering team.
Document system designs, processes, and model performance for transparency and future reference.
Report on project status, challenges, and results to stakeholders.

Benefits

Comprehensive medical, dental, and vision coverage
Flexible Spending Account - healthcare and dependent care
Health Savings Account - high deductible medical plan
Retirement 401(k) with employer match
Paid time off and holidays
Paid parental leave plans for all new parents
Leave benefits including disability, paid family medical leave, and paid military leave
Additional benefits including employee stock purchase plan, family planning reimbursement, tuition reimbursement, transportation expense account, employee assistance program, and more!
