Principal Data Scientist

Caterpillar Inc., Aurora, CO

About The Position

Your Work Shapes the World at Caterpillar Inc. When you join Caterpillar, you're joining a global team who cares not just about the work we do – but also about each other. We are the makers, problem solvers, and future world builders who are creating stronger, more sustainable communities. We don't just talk about progress and innovation here – we make it happen, with our customers, where we work and live. Together, we are building a better world, so we can all enjoy living in it.

Cat Digital is the digital and technology arm of Caterpillar Inc., leveraging the latest technologies to build industry-leading digital solutions for our customers and dealers. With over 1.5 million connected assets worldwide, our teams use data, technology, advanced analytics, telematics, and AI capabilities to help our customers build a better, more sustainable world.

Job Summary: eCommerce is a key digital enabler of Caterpillar’s aftermarket parts and services growth strategy. Delivering on the Caterpillar brand promise of premium, high-quality solutions is an important element in accelerating the development and deployment of Caterpillar’s expanded capabilities in eCommerce. The Principal Data Scientist, Search Quality Framework, is responsible for architecting and implementing an advanced, AI-driven search quality framework that rigorously validates whether search enhancements deliver the intended outcomes. This role leverages sophisticated data science skills, including designing and deploying machine learning (ML) and deep learning models tailored for search relevance, ranking, and personalization.

Key responsibilities involve developing automated evaluation pipelines using statistical analysis, A/B testing, and custom metrics to measure the impact of new algorithms on user experience and business goals. This role combines strategic vision, hands-on AI/ML expertise, and leadership to build scalable, high-performance search algorithms that deliver exceptional user experiences.

Requirements

  • Search AI/ML Models: Strong track record of deploying search-related ML models in large industrial, automotive, or manufacturing parts applications (Learning to Rank, LambdaMART, deep ranking models).
  • Search & Data Quality Frameworks: Extensive background in building frameworks that continuously monitor and improve search accuracy.
  • Generative AI & LLMs: Proficiency in fine-tuning and prompt engineering for Large Language Models, specifically using Retrieval-Augmented Generation (RAG), indexing and retrieval models such as BM25, semantic retrieval, query rewriting, etc.
  • ML Platform Experience: Proven ability to work with large-scale search logs and product data, build robust feature/label pipelines, and deploy models through MLOps/ML platforms and APIs in a cloud environment (AWS/Azure/GCP).
  • Business Statistics: Extensive experience with statistical tools, processes, and practices to describe business results in measurable scales; ability to use statistical tools and processes to assist in making business decisions.
  • Analytical Thinking: Extensive knowledge of techniques and tools that promote effective analysis; ability to determine the root cause of organizational problems and create alternative solutions that resolve these problems.
  • Programming Languages: Extensive knowledge of the concepts and capabilities of Python programming and of applying them to solve business challenges; ability to use tools, techniques, and platforms to write and modify code.
  • Requirements Analysis: Working knowledge of tools, methods, and techniques of requirements analysis; ability to elicit, analyze, and record required business functionality and non-functional requirements to ensure the success of a system or software development project.

Nice To Haves

  • Typically, a Bachelor's, Master's, or PhD degree in Applied Statistics, Data Science, Business Analytics, Predictive Analytics, Business Intelligence & Analytics, Mathematics, Computer Science, Engineering (Aerospace, Electrical, Mechanical, Computer, Industrial, Agricultural, etc.), or equivalent technical degree
  • Extensive experience applying Python (NumPy, SciPy, pandas, etc.) programming to solve business challenges.
  • Extensive experience with advanced data analysis; machine learning methods such as clustering, logistic regression, and neural nets; and statistical methods such as statistical process control (typically 8+ years)
  • Experience in practical applications of onboard architecture / software (e.g. mini projects using Raspberry Pi or any other architecture is a bonus)
  • Working experience with heavy equipment engineering or data analysis.
  • Familiarity with A/B testing frameworks for evaluating and improving model-driven features
  • Working knowledge of cloud technologies (AWS, Azure, Google Cloud, etc.)
  • Advanced experience with version control / repositories such as GitHub
  • Experience operating in an Agile environment
  • Must demonstrate strong initiative, interpersonal skills, and the ability to communicate effectively.

Responsibilities

  • Technical Strategy: Define and implement a long-term technical vision for the search platform to ensure scalability and adaptability to growing data volumes and query complexity.
  • Team Leadership: Mentor and guide a team of search engineers through technical reviews, best practices, and collaborative problem-solving.
  • Feature Development: Introduce advanced capabilities such as NLP, vector search, and personalization to enhance relevance and accuracy.
  • Data Analysis & Optimization: Build search capabilities with measurable KPIs (e.g., CTR, Query Distribution, Zero Search) and leverage analytics to continuously improve search performance.
  • Cross-Functional Collaboration: Partner with product managers, data scientists, and engineering teams to align search initiatives with business objectives.
  • Algorithm Development & Modeling (Optimization Models): Profile and tune deep learning algorithms for maximum search efficiency across keyword matching, user data and behavior, preferences, popularity, and more.
  • Behavioral Models: Profile end users’ behaviors and signals, and fine-tune models to reorder search facet values accordingly.
  • Context Models: Leverage AI/ML/LLM models to discern user intent and map it to relevant search categories.
  • Categorization Models: Leverage graph-based models to build fitment recommendations based on the bill of materials.
  • Personalization Models: Use rule-based segmentation, ML-based recommendation models, PFM – SSL models, and implicit personalization models to enhance search.

Benefits

  • Medical, dental, and vision benefits
  • Paid time off plan (Vacation, Holidays, Volunteer, etc.)
  • 401(k) savings plans
  • Health Savings Account (HSA)
  • Flexible Spending Accounts (FSAs)
  • Health Lifestyle Programs
  • Employee Assistance Program
  • Voluntary Benefits and Employee Discounts
  • Career Development
  • Incentive bonus
  • Disability benefits
  • Life Insurance
  • Parental leave
  • Adoption benefits
  • Tuition Reimbursement