Data Engineer II

Samsara
Houston, TX (Remote)

About The Position

About the role: Samsara's Revenue Operations AI & Data Team is building the future of how we go to market, with intelligence, personalization, and speed. We're a high-impact team of builders, scientists, and strategists focused on transforming sales operations through AI. Our mission is to help sellers reach the right customer at the right time with the right message, and to put everything they need at their fingertips, whether that's data from Salesforce, context from a past call, or content that wins deals.

As a Data Engineer II, you'll own the data platforms that power Samsara's GTM AI engine. You'll be responsible for building, scaling, and optimizing our Databricks data store, visualization store, and AI store, while also enabling large-scale generative AI jobs in Databricks. Your work will ensure that our AI applications are grounded in clean, reliable, and well-structured data, from CRM pipelines and CS systems to GenAI-powered copilots. You'll partner closely with data scientists, AI engineers, and business stakeholders to deliver the infrastructure that fuels innovation at scale.

This role is open to candidates residing in the US except the San Francisco Bay Metro Area, NYC Metro Area, and Washington, D.C. Metro Area.

You should apply if:

  • You want to impact the industries that run our world: Your efforts will result in real-world impact, helping to keep the lights on, get food into grocery stores, reduce emissions, and most importantly, ensure workers return home safely.
  • You are the architect of your own career: If you put in the work, this role won't be your last at Samsara. We set our employees up for success and have built a culture that encourages rapid career development and countless opportunities to experiment and master your craft in a hyper-growth environment.
  • You're energized by our opportunity: The vision we have to digitize large sectors of the global economy requires your full focus and best efforts to bring forth creative, ambitious ideas for our customers.
  • You want to be with the best: At Samsara, we win together, celebrate together, and support each other. You will be surrounded by a high-calibre team that will encourage you to do your best.

Requirements

  • 2-3 years of industry experience in data engineering, with significant experience building large-scale data platforms.
  • Hands-on experience with a modern data technology stack, such as Databricks, DBT, Redshift, RDS, Snowflake, or similar solutions.
  • Proficiency in Python and SQL, with experience in designing robust ETL/ELT pipelines.
  • Experience orchestrating data workflows at scale and enabling machine learning or AI use cases.
  • Strong understanding of data modeling, performance optimization, and cost-efficient infrastructure design.
  • Located in and authorized to work in the United States (this is a fully remote role).
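To make the ETL/ELT requirement concrete, here is a minimal, hypothetical extract-load-transform sketch in Python and SQL. SQLite stands in for a warehouse such as Databricks or Redshift, and all table and column names are invented for illustration:

```python
import sqlite3

# Hypothetical raw CRM rows, standing in for an extract from Salesforce or similar.
raw_opportunities = [
    {"id": "006A", "account": "Acme", "amount": "1200.50", "stage": "Closed Won"},
    {"id": "006B", "account": "Globex", "amount": None, "stage": "Prospecting"},
    {"id": "006C", "account": "Acme", "amount": "800.00", "stage": "Closed Won"},
]

conn = sqlite3.connect(":memory:")  # stand-in for a warehouse connection
conn.execute("CREATE TABLE raw_opps (id TEXT, account TEXT, amount TEXT, stage TEXT)")

# Load: land raw records as-is (the "EL" of ELT).
conn.executemany(
    "INSERT INTO raw_opps VALUES (:id, :account, :amount, :stage)",
    raw_opportunities,
)

# Transform: cast types, filter bad rows, and aggregate in SQL (the "T").
conn.execute("""
    CREATE TABLE won_revenue_by_account AS
    SELECT account, SUM(CAST(amount AS REAL)) AS won_amount
    FROM raw_opps
    WHERE stage = 'Closed Won' AND amount IS NOT NULL
    GROUP BY account
""")

rows = conn.execute(
    "SELECT account, won_amount FROM won_revenue_by_account ORDER BY account"
).fetchall()
print(rows)  # [('Acme', 2000.5)]
```

In a production pipeline the same load-then-transform pattern would run as a scheduled Databricks or DBT job rather than inline Python.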

Nice To Haves

  • Experience enabling generative AI workflows in Databricks or similar platforms.
  • Familiarity with vector databases, embeddings, and retrieval systems.
  • Experience with Salesforce, Gainsight, Gong, Outreach, or other CRM/enablement tools as data sources.
  • Proven ability to automate repetitive tasks, improve data hygiene, and enable experimentation across GTM data use cases, in line with the emerging responsibilities of GTM engineering, where clean, reliable GTM data foundations enable high-leverage automation and insight generation.
  • Exposure to observability, monitoring, and governance best practices for data and AI systems.
  • Ability to collaborate closely with AI/ML teams while driving technical excellence in data engineering.
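As a rough illustration of the retrieval side of these nice-to-haves, here is a toy top-k vector retrieval using cosine similarity in pure Python. The document ids and embeddings are made up; a real system would use a vector database and learned embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# Hypothetical document embeddings keyed by document id.
index = {
    "call_summary_1": [0.9, 0.1, 0.0],
    "pricing_sheet": [0.0, 0.8, 0.6],
    "case_study": [0.7, 0.2, 0.1],
}

def retrieve(query_vec, k=2):
    """Return the k document ids most similar to the query embedding."""
    scored = sorted(index.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]

print(retrieve([1.0, 0.0, 0.0]))  # ['call_summary_1', 'case_study']
```

The same nearest-neighbor idea underlies RAG pipelines, where retrieved documents are passed to a generative model as context.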

Responsibilities

  • Build and maintain ETL/ELT data pipelines in Databricks and Spark, ensuring data is ingested, transformed, and delivered reliably for analytics and AI use cases.
  • Develop and evolve logical and physical data models to support reporting, experimentation, and advanced workflows (e.g., scoring models, signal generation).
  • Implement monitoring, alerts, and testing for data quality, timeliness, and lineage to ensure trustworthy data delivery.
  • Support workflow orchestration with Databricks Jobs, DBT, or equivalent scheduling tools to operate at scale.
  • Contribute to data pipelines and tooling that support retrieval-augmented generation (RAG), vector integrations, or embedding workflows.
  • Design and optimize bulk GenAI data pipelines in Databricks to support generative AI applications at scale.
  • Partner with AI engineers and data scientists to enable experimentation, model training, and production-grade deployments.
  • Develop frameworks for data ingestion, transformation, governance, and monitoring across CRM, sales, and revenue systems.
  • Work with RevOps, sales, and customer success stakeholders to translate business needs into data requirements and stable technical implementations.
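The monitoring and data-quality responsibilities above could be sketched, very roughly, as a set of checks run after each pipeline load. Thresholds, field names, and alert wording here are all hypothetical:

```python
from datetime import datetime, timedelta, timezone

def check_batch(rows, max_null_rate=0.1, max_staleness=timedelta(hours=24)):
    """Return a list of alert strings for a loaded batch; empty means healthy."""
    alerts = []
    if not rows:
        return ["batch is empty"]

    # Completeness: flag fields whose null rate exceeds the threshold.
    for field in rows[0].keys():
        null_rate = sum(1 for r in rows if r[field] is None) / len(rows)
        if null_rate > max_null_rate:
            alerts.append(f"{field}: null rate {null_rate:.0%} exceeds {max_null_rate:.0%}")

    # Timeliness: flag batches whose newest record is too old.
    # (Assumes at least one non-null timestamp; a real check would guard this.)
    newest = max(r["updated_at"] for r in rows if r["updated_at"] is not None)
    if datetime.now(timezone.utc) - newest > max_staleness:
        alerts.append("batch is stale")
    return alerts

now = datetime.now(timezone.utc)
batch = [
    {"id": "1", "amount": 100.0, "updated_at": now},
    {"id": "2", "amount": None, "updated_at": now - timedelta(hours=1)},
]
print(check_batch(batch))  # ['amount: null rate 50% exceeds 10%']
```

In practice these checks would be wired into the orchestrator (Databricks Jobs, DBT tests, or similar) so that failures page on-call rather than print to stdout.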