Data Engineer – Kafka, Snowflake, Google Dataflow (Streaming)

CrackaJack Digital Solutions LLC, Renton, WA
Posted 1 day ago

About The Position

The Data Engineer will play a critical role in building scalable, reliable data pipelines that support real-time and batch processing workflows. You will work closely with cross-functional teams to integrate multiple data sources, build Operational Data Stores (ODS) and transformation pipelines, and enable timely data availability for reporting and analytics through dashboards.

Requirements

  • Event Streaming: Confluent Kafka (proficiency), Kafka Connectors (an illustrative consumer sketch follows this list)
  • API Management: Apigee (proficiency)
  • Cloud Storage & Data Warehousing: AWS S3, Snowflake
  • Data Processing: Google Dataflow
  • Programming: SQL, Python (proficiency)
  • Batch & Real-Time Pipeline Development
  • Data Visualization Support: Tableau (basic understanding for data publishing)
  • Experience building Operational Data Stores (ODS) and data transformation pipelines in Snowflake
  • Bachelor’s or Master’s degree in Computer Science, Data Engineering, Information Systems, or related field.
  • 3+ years of proven experience in data engineering, especially with streaming and batch data pipelines.
  • Hands-on experience with Kafka ecosystem (Confluent Kafka, Connectors) and cloud data platforms (Snowflake, AWS).
  • Skilled in Python programming for data processing and automation.
  • Strong understanding of data modeling, ETL/ELT processes, and data quality principles.
  • Ability to work collaboratively in cross-functional teams and communicate technical concepts effectively.
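As a rough illustration of the event-streaming and Python items above, the sketch below consumes events from Confluent Kafka with the confluent-kafka Python client. The broker address, consumer group, and topic name are assumptions for illustration, not details from this posting.

```python
import json

from confluent_kafka import Consumer

# All connection details below are illustrative assumptions.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",  # hypothetical broker
    "group.id": "ods-ingestion",            # hypothetical consumer group
    "auto.offset.reset": "earliest",
})
consumer.subscribe(["service-repair-events"])  # hypothetical topic

try:
    while True:
        msg = consumer.poll(timeout=1.0)
        if msg is None:
            continue
        if msg.error():
            print(f"Consumer error: {msg.error()}")  # real code would log/alert
            continue
        event = json.loads(msg.value().decode("utf-8"))
        # Hand the event to downstream cleansing / ODS-loading steps.
        print(event)
finally:
    consumer.close()
```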

Nice To Haves

  • Experience with Google Cloud Platform services, especially Google Dataflow, is highly desirable.
  • Familiarity with truck-industry aftersales or automotive service and repair data is a plus.

Responsibilities

Data Ingestion & Integration
  • Develop and maintain data ingestion pipelines for service and repair data using Confluent Kafka for event streaming.
  • Implement connectors and integrations between Kafka, AWS S3, Google Dataflow, and Snowflake to facilitate batch and real-time data flows (see the connector sketch after these items).
  • Work with APIs and Apigee to securely ingest and distribute data across internal and external systems, including dealer networks.
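One hedged sketch of this kind of connector work: registering a Confluent S3 sink connector through the Kafka Connect REST API so topic data lands in AWS S3. The Connect host, connector name, topic, and bucket are hypothetical, not taken from this posting.

```python
import requests

# Hypothetical connector definition; names and addresses are assumptions.
connector = {
    "name": "service-repair-s3-sink",
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": "1",
        "topics": "service-repair-events",
        "s3.bucket.name": "raw-service-repair-data",
        "s3.region": "us-west-2",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
        "flush.size": "1000",
    },
}

# Kafka Connect exposes a REST endpoint for creating connectors.
resp = requests.post("http://connect:8083/connectors", json=connector, timeout=30)
resp.raise_for_status()
print(resp.json())
```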

Data Cleansing & Transformation
  • Build and optimize data cleansing, normalization, and transformation pipelines in Google Dataflow for real-time processing (see the sketch after these items).
  • Design and implement batch transformation jobs within Snowflake, building and maintaining the Operational Data Store (ODS).
  • Ensure data quality, consistency, and integrity across all processing stages.
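A minimal sketch of the kind of cleansing step that could run on Google Dataflow via Apache Beam's Python SDK. The field names and the in-memory test source are stand-ins for a real streaming input, and in practice the pipeline options would carry the Dataflow runner, project, region, and streaming flags.

```python
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def normalize(record: dict) -> dict:
    """Trim strings, standardize casing, and coerce numeric fields."""
    return {
        "vin": record.get("vin", "").strip().upper(),
        "repair_code": record.get("repair_code", "").strip(),
        "cost": float(record["cost"]) if record.get("cost") not in (None, "") else 0.0,
    }


options = PipelineOptions()  # on Dataflow: runner, project, region, streaming options
with beam.Pipeline(options=options) as p:
    (
        p
        | "Read" >> beam.Create([{"vin": " 1abc ", "repair_code": "R10", "cost": "42.5"}])
        | "Cleanse" >> beam.Map(normalize)
        | "DropEmptyVin" >> beam.Filter(lambda r: r["vin"] != "")
        | "Print" >> beam.Map(print)  # a real pipeline would write to Snowflake or storage
    )
```

The batch-side counterpart would typically be SQL (for example, MERGE statements) executed inside Snowflake to maintain the ODS tables.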

Data Publishing & Reporting Support
  • Publish transformed and aggregated data to internal and external dashboards using APIs, Kafka topics, and Tableau (see the sketch after these items).
  • Collaborate with data analysts and business stakeholders to support reporting and analytics requirements.
  • Monitor and troubleshoot data pipelines to ensure high availability and performance.
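As a hedged example of the publishing step, the sketch below reads an aggregate from Snowflake and produces it to a Kafka topic that a dashboard or Tableau data source could consume. The connection parameters, query, table, and topic name are all hypothetical.

```python
import json

import snowflake.connector
from confluent_kafka import Producer

# Hypothetical Snowflake and Kafka connection details.
conn = snowflake.connector.connect(
    account="my_account", user="my_user", password="***",
    warehouse="ANALYTICS_WH", database="ODS", schema="PUBLIC",
)
producer = Producer({"bootstrap.servers": "localhost:9092"})

try:
    cur = conn.cursor()
    cur.execute(
        "SELECT dealer_id, COUNT(*) AS repairs FROM repair_orders GROUP BY dealer_id"
    )
    for dealer_id, repairs in cur:
        payload = json.dumps({"dealer_id": dealer_id, "repairs": repairs})
        producer.produce("dealer-repair-metrics", value=payload)  # hypothetical topic
    producer.flush()
finally:
    conn.close()
```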

Collaboration & Documentation
  • Partner with data architects, analysts, and external dealer teams to understand data requirements and source systems.
  • Document data workflows, processing logic, and integration specifications.
  • Adhere to best practices in data security, governance, and compliance.