Sr Associate Data Scientist (Databricks Platform)

McKesson
Overland Park, KS
Hybrid

About The Position

McKesson is an impact-driven, Fortune 10 company that touches virtually every aspect of healthcare. We are known for delivering insights, products, and services that make quality care more accessible and affordable. Here, we focus on the health, happiness, and well-being of you and those we serve – we care. What you do at McKesson matters. We foster a culture where you can grow, make an impact, and are empowered to bring new ideas. Together, we thrive as we shape the future of health for patients, our communities, and our people. If you want to be part of tomorrow's health today, we want to hear from you.

Rx Savings Solutions (RxSS), part of McKesson's CoverMyMeds organization, offers an innovative, patented software system that educates and empowers consumers to make the best healthcare choices at the lowest cost. Founded and operated by a team of pharmacists and software engineers, we support a collaborative, cost-saving solution for purchasing prescription drugs. We currently have an opportunity for a Sr Associate Data Scientist to join our engineering team!

The Sr Associate Data Scientist is a junior-level position on our team that contributes to the development, deployment, and optimization of machine-learning models and analytical solutions within the RxSS Data & AI ecosystem. This role leverages the Databricks platform (Delta Lake, Unity Catalog, MLflow, and associated cloud services) to support model lifecycle activities, data engineering collaboration, and applied data science initiatives across the business. The Sr Associate Data Scientist is responsible for hands-on model experimentation, feature engineering, and data preparation in partnership with the Data Engineering, Product, and Analytics teams.

Our ideal candidate will reside in the Columbus, OH or Overland Park, KS area to support a hybrid work arrangement, but we may consider a well-qualified, fully remote candidate.

At this time, we are not able to offer sponsorship for employment visas. This includes individuals currently on F-1 OPT, STEM OPT, or any other visa status that would require future sponsorship. Candidates must be authorized to work in the United States on a permanent basis without the need for current or future sponsorship.

Requirements

  • Bachelor's degree in Data Science, Computer Science, Analytics, Math, Statistics, Engineering, or a related field, or equivalent related experience
  • Typically requires 2+ years of experience in applied ML, data science, or advanced analytics
  • Hands-on experience with Python, PySpark, SQL, and Git-based workflows
  • Practical exposure to cloud-based ML environments (preferably Databricks)
  • Understanding of ML techniques such as regression, classification, clustering, time-series forecasting, and embeddings
  • Ability to work with large, complex datasets

Nice To Haves

  • Experience with Databricks MLflow, model serving, and workflow orchestration
  • Familiarity with Delta Lake storage formats, feature engineering at scale, and medallion architecture patterns
  • Experience deploying models into production environments with monitoring and observability

Responsibilities

Machine Learning & Model Development

  • Build, train, evaluate, and optimize machine-learning models using Spark MLlib, Python, and cloud-based toolchains
  • Perform exploratory data analysis (EDA), statistical profiling, and feature engineering on large-scale datasets hosted in Databricks
  • Implement and manage MLflow experiment tracking, model registry, versioning, and reproducibility workflows
  • Contribute to model monitoring, performance tuning, drift detection, and continuous improvement

Databricks Cloud Engineering

  • Develop notebooks, jobs, and workflows within Databricks for data preparation, model training, and batch/streaming inference
  • Utilize Unity Catalog for secure, governed data access, lineage, and metadata management
  • Work with Delta Lake (bronze/silver/gold layers) for scalable feature pipelines supporting both training and production
  • Collaborate with Engineering to migrate workloads to Databricks and support transformations, optimizations, and cost-efficient compute usage

Data & Feature Engineering

  • Build reusable, production-grade feature pipelines in PySpark and SQL
  • Implement data validation, quality checks, and transformation logic consistent with enterprise guidelines
  • Participate in design sessions for ingestion, medallion architecture workflows, and schema evolution

Collaboration & Cross-Functional Support

  • Partner with Data Engineering, Analytics, Product, and SMEs to translate business problems into data-driven solutions
  • Document model assumptions, data transformations, evaluation metrics, and deployment patterns