Lead Data Engineer

Honeywell, Atlanta, GA

About The Position

Honeywell is accelerating its transformation from industrial automation to full autonomy, and the data that powers that future starts here. As a Lead Data Engineer on our Industrial AI & Data Platforms team, you will architect and own the data foundations that enable physical AI at scale: from terabytes of IoT sensor telemetry streaming through our medallion lakehouse to production-grade Generative AI pipelines that deliver actionable intelligence across Honeywell's global industrial operations. You will serve as a technical anchor for a team of engineers, shaping architecture decisions, building cutting-edge AI solutions, raising engineering standards, and directly building the AI-ready data products that Honeywell's autonomy mission depends on. This role sits at the intersection of modern data engineering and applied GenAI. If you are energized by building systems that have real-world industrial impact, this is your role.

Requirements

  • 8+ years of data engineering experience with at least 2 years in a lead or senior role, demonstrating progression in technical complexity and team leadership.
  • Hands-on experience building and operating medallion lakehouse architectures (Bronze / Silver / Gold).
  • Deep expertise in Apache Spark / PySpark with production experience on Azure Databricks at scale.
  • Strong proficiency with streaming platforms (Apache Kafka and/or Azure Event Hub) for real-time IoT data.
  • Cloud data architecture skills (Azure preferred; AWS/GCP a plus) with experience designing scalable, cost-effective data lakes and warehouses using cloud-native services.
  • Data modeling and schema design expertise for both transactional and analytical workloads, including dimensional modeling and data vault methodologies.
  • Proven experience building data pipelines for GenAI or ML applications: RAG systems, embedding pipelines, and document ingestion.
  • MLOps familiarity including model versioning, feature stores, and monitoring/observability for data and ML systems.
  • Demonstrated ability to lead technical design reviews, mentor engineers, and drive architectural decisions with stakeholder buy-in.
  • Proficiency in CI/CD using GitHub Actions for automating data pipeline deployments.
  • US PERSON REQUIREMENTS: Due to compliance with U.S. export control laws and regulations, the candidate must be a U.S. Person, which is defined as a U.S. citizen, a U.S. permanent resident, or a person with protected status in the U.S. under asylum or refugee status, or must have the ability to obtain an export authorization.

Nice To Haves

  • Experience with LangChain, LangGraph, or other agentic AI orchestration frameworks.
  • Expertise in real-time data processing frameworks (Apache Spark Streaming, Structured Streaming)
  • Knowledge of MLOps practices and experience building data pipelines for AI model deployment
  • Experience with time-series databases and IoT data modeling patterns
  • Familiarity with containerization (Docker) and orchestration (Kubernetes) for AI workloads
  • Strong background in data quality implementation for AI training data
  • Experience working with distributed teams and cross-functional collaboration
  • Knowledge of data security and governance practices for AI systems
  • Experience working on analytics projects with Agile and Scrum methodologies

Responsibilities

Architecture & Technical Leadership

  • Architect end-to-end data pipelines processing terabytes of IoT telemetry on Azure Databricks (PySpark DLT, Lakeflow) using medallion lakehouse architecture.
  • Design and optimize real-time ingestion pipelines from Azure Event Hub and Apache Kafka for high-volume industrial IoT telemetry.
  • Build fault-tolerant, idempotent streaming architectures handling schema evolution, backpressure, and latency SLAs.
  • Lead architecture reviews, set engineering standards, and drive decisions on data modeling, pipeline design, and platform evolution.
  • Define technical direction for AI-ready data products including vector stores, embedding pipelines, and RAG-ready structured/unstructured data.
  • Adopt emerging LLM orchestration frameworks (LangChain, LangGraph) to accelerate GenAI platform capabilities.

GenAI & AI Pipeline Development

  • Build production GenAI pipelines: RAG workflows, document ingestion, PII anonymization, and vector database infrastructure.
  • Collaborate with data scientists and AI engineers to deliver high-quality, AI-ready datasets that improve downstream model performance.

DevOps, Security & Governance

  • Enforce data governance, access control, and security policies; lead PII detection and anonymization strategies across the data platform.
  • Champion CI/CD practices using GitHub Actions, DAB, Octopus, and Bamboo for automated, reliable pipeline delivery.
  • Ensure compliance with enterprise security standards within the SDLC.

Team Development & Stakeholder Engagement

  • Mentor engineers across seniority levels through code reviews, pairing, and technical coaching.
  • Translate business and AI product requirements into clear technical roadmaps and execution plans.
  • Partner with data scientists, product owners, and architects to align data investments with Honeywell's autonomy strategy.

Benefits

  • In addition to a competitive salary, leading-edge work, and developing solutions side-by-side with dedicated experts in their fields, Honeywell employees are eligible for a comprehensive benefits package.
  • This package includes employer-subsidized Medical, Dental, Vision, and Life Insurance; Short-Term and Long-Term Disability; 401(k) match, Flexible Spending Accounts, Health Savings Accounts, EAP, and Educational Assistance; Parental Leave, Paid Time Off (for vacation, personal business, sick time, and parental leave), and 12 Paid Holidays.
  • For more information visit: Benefits at Honeywell