Data Engineer – Kafka, Snowflake, Google Dataflow (Streaming)

CrackaJack Digital Solutions LLC, Renton, WA
Posted 1 day ago

About The Position

The Data Engineer will play a critical role in building scalable, reliable data pipelines that support real-time and batch processing workflows. You will work closely with cross-functional teams to integrate multiple data sources, build Operational Data Stores (ODS) and transformation pipelines, and enable timely data availability for reporting and analytics through dashboards.

Requirements

  • Event Streaming: Confluent Kafka (proficiency), Kafka Connectors (an illustrative consumer sketch follows this list)
  • API Management: Apigee (proficiency)
  • Cloud Storage & Data Warehousing: AWS S3, Snowflake
  • Data Processing: Google Dataflow
  • Programming: SQL, Python (proficiency)
  • Batch & Real-Time Pipeline Development
  • Data Visualization Support: Tableau (basic understanding for data publishing)
  • Experience building Operational Data Stores (ODS) and data transformation pipelines in Snowflake
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or related field.
  • 3+ years of proven experience in data engineering, especially with streaming and batch data pipelines.
  • Hands-on experience with Kafka ecosystem (Confluent Kafka, Connectors) and cloud data platforms (Snowflake, AWS).
  • Skilled in Python programming for data processing and automation.
  • Strong understanding of data modeling, ETL/ELT processes, and data quality principles.
  • Ability to work collaboratively in cross-functional teams and communicate technical concepts effectively.
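As a rough illustration of the event-streaming and Python items above, the sketch below consumes events from Confluent Kafka with the confluent-kafka Python client. The broker address, consumer group, and topic name are assumptions for illustration, not details from this posting.

```python
import json

from confluent_kafka import Consumer

# All connection details below are illustrative assumptions.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # hypothetical broker
    "group.id": "ods-ingestion",            # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["service-repair-events"])  # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")  # real code would log/alert
            continue
        event = json.loads(msg.value().decode("utf-8"))
        # Hand the event to downstream cleansing / ODS-loading steps.
        print(event)
finally:
    consumer.close()
```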

Nice To Haves

  • Experience with Google Cloud Platform services, especially Google Dataflow, is highly desirable.
  • Familiarity with truck-industry aftersales or automotive service and repair data is a plus.

Responsibilities

Data Ingestion & Integration
  • Develop and maintain data ingestion pipelines for service and repair data using Confluent Kafka for event streaming.
  • Implement connectors and integrations between Kafka, AWS S3, Google Dataflow, and Snowflake to facilitate batch and real-time data flows (see the connector sketch after these items).
  • Work with APIs and Apigee to securely ingest and distribute data across internal and external systems, including dealer networks.
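One hedged sketch of this kind of connector work: registering a Confluent S3 sink connector through the Kafka Connect REST API so topic data lands in AWS S3. The Connect host, connector name, topic, and bucket are hypothetical, not taken from this posting.

```python
import requests

# Hypothetical connector definition; names and addresses are assumptions.
connector = {
    "name": "service-repair-s3-sink",
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": "1",
        "topics": "service-repair-events",
        "s3.bucket.name": "raw-service-repair-data",
        "s3.region": "us-west-2",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "1000",
    },
}

# Kafka Connect exposes a REST endpoint for creating connectors.
resp = requests.post("http://connect:8083/connectors", json=connector, timeout=30)
resp.raise_for_status()
print(resp.json())
```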

Data Cleansing & Transformation
  • Build and optimize data cleansing, normalization, and transformation pipelines in Google Dataflow for real-time processing (see the sketch after these items).
  • Design and implement batch transformation jobs within Snowflake, building and maintaining the Operational Data Store (ODS).
  • Ensure data quality, consistency, and integrity across all processing stages.
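A minimal sketch of the kind of cleansing step that could run on Google Dataflow via Apache Beam's Python SDK. The field names and the in-memory test source are stand-ins for a real streaming input, and in practice the pipeline options would carry the Dataflow runner, project, region, and streaming flags.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def normalize(record: dict) -> dict:
    """Trim strings, standardize casing, and coerce numeric fields."""
    return {
        "vin": record.get("vin", "").strip().upper(),
        "repair_code": record.get("repair_code", "").strip(),
        "cost": float(record["cost"]) if record.get("cost") not in (None, "") else 0.0,
    }


options = PipelineOptions()  # on Dataflow: runner, project, region, streaming options
with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.Create([{"vin": " 1abc ", "repair_code": "R10", "cost": "42.5"}])
        | "Cleanse" >> beam.Map(normalize)
        | "DropEmptyVin" >> beam.Filter(lambda r: r["vin"] != "")
        | "Print" >> beam.Map(print)  # a real pipeline would write to Snowflake or storage
    )
```

The batch-side counterpart would typically be SQL (for example, MERGE statements) executed inside Snowflake to maintain the ODS tables.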

Data Publishing & Reporting Support
  • Publish transformed and aggregated data to internal and external dashboards using APIs, Kafka topics, and Tableau (see the sketch after these items).
  • Collaborate with data analysts and business stakeholders to support reporting and analytics requirements.
  • Monitor and troubleshoot data pipelines to ensure high availability and performance.
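As a hedged example of the publishing step, the sketch below reads an aggregate from Snowflake and produces it to a Kafka topic that a dashboard or Tableau data source could consume. The connection parameters, query, table, and topic name are all hypothetical.

```python
import json

import snowflake.connector
from confluent_kafka import Producer

# Hypothetical Snowflake and Kafka connection details.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="***",
    warehouse="ANALYTICS_WH", database="ODS", schema="PUBLIC",
)
producer = Producer({"bootstrap.servers": "localhost:9092"})

try:
    cur = conn.cursor()
    cur.execute(
        "SELECT dealer_id, COUNT(*) AS repairs FROM repair_orders GROUP BY dealer_id"
    )
    for dealer_id, repairs in cur:
        payload = json.dumps({"dealer_id": dealer_id, "repairs": repairs})
        producer.produce("dealer-repair-metrics", value=payload)  # hypothetical topic
    producer.flush()
finally:
    conn.close()
```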

Collaboration & Documentation
  • Partner with data architects, analysts, and external dealer teams to understand data requirements and source systems.
  • Document data workflows, processing logic, and integration specifications.
  • Adhere to best practices in data security, governance, and compliance.