Principal Data Engineer

Aderant
Atlanta, GA

About The Position

We are seeking a Principal Data Engineer to lead the design, development, and optimization of our cloud-native data platform. You will be responsible for architecting scalable ETL pipelines, mentoring engineers, and driving technical decisions that shape our data infrastructure. This is a hands-on leadership role requiring deep expertise in AWS data services, distributed computing, and modern data lakehouse architectures.

Requirements

  • 8+ years of experience in data engineering, with 3+ years in a senior or lead capacity
  • Proven track record of designing and operating large-scale data platforms in production
  • Experience leading technical projects and mentoring engineers
  • Proficiency with AWS data engineering services, including but not limited to:
      • Data Movement & Integration: DMS (Database Migration Service), SQS, Lambda
      • Data Processing: AWS Glue, EMR, Step Functions
      • Data Storage: S3, DynamoDB, Redshift
      • Governance & Observability: DataZone, CloudWatch, CloudTrail
  • Expert-level proficiency in Python and SQL (Spark SQL, T-SQL, or similar)
  • Deep experience with Apache Spark (PySpark) for distributed data processing
  • Strong knowledge of data lake table formats: Apache Iceberg, Delta Lake, or Apache Hudi
  • Proficiency with dimensional modeling and data warehouse design patterns
  • Experience with infrastructure as code and CI/CD pipelines (GitHub Actions, Terraform, or CloudFormation)
  • Familiarity with data serialization formats (Parquet, Avro, JSON)
  • Experience designing medallion architectures or similar tiered data processing patterns
  • Understanding of CDC (Change Data Capture) patterns and event-driven architectures
  • Knowledge of data lineage, cataloging, and metadata management
  • Experience implementing row-level security and data access controls
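
The CDC requirement above boils down to replaying a stream of insert/update/delete events against a keyed table — the same operation a Glue/Spark `MERGE` into an Iceberg table performs at scale. A minimal, framework-agnostic sketch in plain Python (all names are illustrative, not from any Aderant codebase):

```python
from typing import Dict, List

def apply_cdc_batch(table: Dict[str, dict], events: List[dict]) -> Dict[str, dict]:
    """Apply insert/update/delete change events to a keyed table.

    Each event carries an 'op' ('I', 'U', or 'D'), a primary 'key',
    and (for inserts and updates) the new row under 'row'.
    """
    for event in events:
        op, key = event["op"], event["key"]
        if op in ("I", "U"):
            table[key] = event["row"]      # upsert: last write wins
        elif op == "D":
            table.pop(key, None)           # tolerate deletes of absent keys
        else:
            raise ValueError(f"unknown op: {op!r}")
    return table

# Replay a small change stream against a hypothetical 'silver' table
silver = {"42": {"name": "Acme", "tier": "gold"}}
events = [
    {"op": "U", "key": "42", "row": {"name": "Acme", "tier": "platinum"}},
    {"op": "I", "key": "99", "row": {"name": "Initech", "tier": "silver"}},
    {"op": "D", "key": "42"},
]
silver = apply_cdc_batch(silver, events)
# silver now holds only key "99"
```

A production pipeline would express the same logic declaratively (e.g. Spark SQL `MERGE INTO` against Iceberg), with ordering and dedup handled upstream.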

Nice To Haves

  • Experience with observability frameworks such as OpenTelemetry
  • Familiarity with data validation libraries (Pydantic, Great Expectations)
  • Experience with async Python (asyncio, aioboto3) for high-throughput applications
  • Knowledge of Kubernetes and containerized workloads
  • Experience with data mesh or data product architectures
  • Background in legal, financial, or enterprise SaaS domains
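
The async-Python item above usually means fanning out many I/O-bound calls (e.g. S3 reads via aioboto3) with bounded concurrency. A dependency-free sketch of that pattern, where `fetch_object` is a stand-in for a real network call:

```python
import asyncio

async def fetch_object(key: str) -> str:
    """Stand-in for an I/O-bound call (e.g. an S3 GET via aioboto3)."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"payload:{key}"

async def fetch_all(keys: list[str], max_concurrency: int = 8) -> list[str]:
    """Fetch many objects concurrently, capped by a semaphore."""
    sem = asyncio.Semaphore(max_concurrency)

    async def bounded(key: str) -> str:
        async with sem:
            return await fetch_object(key)

    # gather preserves input order in its results
    return await asyncio.gather(*(bounded(k) for k in keys))

results = asyncio.run(fetch_all([f"obj-{i}" for i in range(20)]))
```

The semaphore is the important part: unbounded `gather` over thousands of keys can exhaust connections or memory, so high-throughput fan-out is typically capped.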

Responsibilities

  • Technical Leadership
      • Architect and evolve our medallion-based data lakehouse (Bronze/Silver/Gold tiers) on AWS
      • Design and implement data transformation pipelines that scale to handle petabytes of data
      • Establish best practices for data modeling, including dimensional modeling (Fact/Dimension tables) and slowly changing dimensions
      • Define and enforce data quality, governance, and security standards across the platform
      • Lead technical design reviews and provide guidance on complex engineering challenges
  • Hands-On Engineering
      • Build and maintain production-grade ETL pipelines using AWS Glue, PySpark, and Apache Iceberg
      • Develop reusable Python libraries and frameworks for data processing and transformation
      • Implement data lineage tracking and query optimization strategies
      • Design event-driven data architectures using Step Functions, Lambda, and SQS
      • Optimize Spark jobs for performance, cost efficiency, and reliability
  • Collaboration & Mentorship
      • Mentor and coach data engineers, fostering a culture of technical excellence
      • Partner with Data Scientists, Analytics Engineers, and Product teams to understand data requirements
      • Collaborate with Platform and DevOps teams on CI/CD, observability, and infrastructure automation
      • Contribute to architectural decisions and technical roadmap planning
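
The slowly-changing-dimension work mentioned above (SCD Type 2: preserve history by closing out the current row and appending a new version) can be sketched without any Spark dependency; function and field names here are illustrative:

```python
from datetime import date
from typing import List

def scd2_upsert(dim: List[dict], key: str, attrs: dict, as_of: date) -> List[dict]:
    """Apply an SCD Type 2 change: close the current row, append a new version."""
    for row in dim:
        if row["key"] == key and row["end_date"] is None:
            if all(row.get(k) == v for k, v in attrs.items()):
                return dim               # attributes unchanged: nothing to version
            row["end_date"] = as_of      # close out the current version
            break
    dim.append({"key": key, **attrs, "start_date": as_of, "end_date": None})
    return dim

# A customer dimension: the 'gold' row is closed when 'platinum' arrives
dim = scd2_upsert([], "cust-1", {"tier": "gold"}, date(2024, 1, 1))
dim = scd2_upsert(dim, "cust-1", {"tier": "platinum"}, date(2024, 6, 1))
# dim holds two versions: the closed 'gold' row and the open 'platinum' row
```

In the lakehouse itself this would be a `MERGE` against an Iceberg dimension table, but the row-versioning logic is the same.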