Principal ML Engineer

SOLVENTUM
Remote

About The Position

At Solventum, we enable better, smarter, safer healthcare to improve lives. As a new company with a long legacy of creating breakthrough solutions for our customers' toughest challenges, we pioneer game-changing innovations at the intersection of health, material, and data science that change patients' lives for the better while enabling healthcare professionals to perform at their best. Because people, and their wellbeing, are at the heart of every scientific advancement we pursue, we partner closely with the brightest minds in healthcare to ensure that every solution we create melds the latest technology with compassion and empathy. Because at Solventum, we never stop solving for you.

As a Principal ML Engineer, you will lead the technical architecture and engineering strategy for integrating sophisticated AI into high-stakes Healthcare Information Systems (HIS). We are looking for a seasoned builder who prioritizes reliability, system performance, and automated scalability over hype. While many focus on the "science" of modeling, your mission is the engineering of the ecosystem. You will architect the robust MLOps pipelines and cloud infrastructure required to move models from experimental notebooks into mission-critical clinical environments. You are the bridge between raw data and resilient, production-grade AI services.

Requirements

  • Bachelor's Degree or higher in Computer Science, Software Engineering, or a related technical field.
  • 10+ years of experience in software engineering, with at least 6 years dedicated to deploying and maintaining large-scale ML systems in production (not just research or POCs).
  • MLOps & Cloud: Expert-level experience with Cloud Providers (AWS/GCP/Azure) and orchestration tools (Kubernetes, Kubeflow, or Airflow).
  • Engineering & Programming: Expert-level proficiency in Python and Java/Go (or similar), with deep knowledge of backend frameworks and system design patterns.
  • Data Engineering: Strong experience with Spark, Snowflake/Databricks, and building scalable feature stores.
  • Applied AI: Hands-on experience deploying Generative AI (LLMs) and Agentic frameworks (LangChain/LangGraph) within a containerized microservices architecture.
  • Must be legally authorized to work in the country of employment without sponsorship for employment visa status (e.g., H-1B status).

Nice To Haves

  • Hardware Optimization: Experience with GPU optimization, quantization, or specialized serving frameworks (vLLM, TGI).
  • Advanced Degree: Master's or PhD in Computer Science, Software Engineering, or a related technical field is preferred.
  • Security & Compliance: Deep understanding of cybersecurity best practices within regulated industries (Healthcare, Finance, or Defense).
  • Distributed Systems: Proven ability to design systems that handle massive concurrency and distributed data processing.

Responsibilities

MLOps & System Architecture

  • Production Lifecycle: Lead the design and implementation of end-to-end ML lifecycles, focusing on automated CI/CD pipelines, model versioning (MLflow/DVC), and reproducible experimentation.
  • Inference at Scale: Architect high-performance serving layers for both LLMs and classical models, ensuring low-latency and high-availability in a secure healthcare cloud environment.
  • Agentic Orchestration: Build the underlying infrastructure for agent-based reasoning systems, ensuring these "Agentic" workflows are traceable, auditable, and integrated into existing HIS.

Data Engineering & Infrastructure

  • Data Reliability: Design robust data pipelines (ETL/ELT) to process healthcare-specific formats (FHIR, HL7, DICOM) into high-quality features for real-time and batch inference.
  • Hybrid Infrastructure: Manage and optimize cloud-native infrastructure (AWS/Azure/GCP) using Infrastructure as Code (Terraform/Pulumi) to support heavy compute workloads.
  • System Integrity: Implement comprehensive monitoring and observability frameworks to detect data drift, model decay, and system bottlenecks before they impact clinical outcomes.

Technical Leadership & Governance

  • Engineering Authority: Serve as the lead architect for the ML platform, ensuring all systems are HIPAA/HITRUST compliant and follow "security-by-design" principles.
  • Operational Excellence: Establish rigorous standards for code quality, containerization (Docker/Kubernetes), and system documentation across the engineering organization.
  • Strategic Mentorship: Elevate the team by fostering a culture of "ML as Engineering," guiding junior engineers in building maintainable, modular, and scalable software.