About The Position

Nextpower is seeking a highly motivated summer intern to join the Data Engineering team to help build and operate production data pipelines that power analytics and operational insights from power plant telemetry. This role reports to a data engineering leader and focuses on developing reliable, cost-efficient streaming and lakehouse pipelines on Azure. The internship offers hands-on experience delivering real production data systems, including streaming ingestion, Delta Lake lakehouse patterns (Bronze/Silver/Gold), and data quality engineering. Interns will work closely with experienced engineers and stakeholders, receive meaningful mentorship, and gain exposure to how data products are designed, delivered, and supported in production.

Requirements

  • Currently pursuing a Bachelor’s or Master’s degree in Computer Science, Data Engineering, Software Engineering, or a related field (or equivalent practical experience)
  • Proficiency in Python and/or SQL, with the ability to write clean, maintainable code
  • Strong understanding of data fundamentals (schemas, data modeling basics, transformations, and data reliability concepts)
  • Exposure to distributed systems and/or big data concepts (e.g., Spark, streaming, or parallel processing)
  • Strong analytical and troubleshooting skills; ability to debug issues using logs, metrics, and data inspection
  • Excellent written and verbal communication skills, including comfort documenting work clearly
  • Comfortable working in a fast-paced, high-performance environment with mentorship and feedback

Nice To Haves

  • Experience with Azure data services (Event Hubs, ADLS Gen2) or equivalent AWS/GCP cloud data platforms
  • Experience with Databricks, Delta Lake, or lakehouse architectures
  • Exposure to streaming (Kafka or similar) and event-driven data pipelines
  • Familiarity with data quality practices (tests, validation, data contracts) and CI/CD

Responsibilities

  • Build and operate telemetry data pipelines from ingestion through curated lakehouse tables
  • Implement streaming ingestion using Kafka-style event streaming (e.g., Azure Event Hubs or similar technologies)
  • Develop Databricks lakehouse pipelines producing Bronze/Silver/Gold Delta tables (or equivalent patterns)
  • Enforce schema validation, data contracts, and data quality checks (e.g., expectations/tests)
  • Handle real-world pipeline issues such as late-arriving data, duplicates, out-of-order events, and schema drift
  • Optimize pipelines for performance, reliability, and cost
  • Create clear documentation and runbooks to support the pipelines you own in production
  • Collaborate with engineers and stakeholders to translate requirements into robust, maintainable data products

Benefits

  • Hands-on experience building and supporting production-grade data pipelines for real telemetry systems
  • Practical exposure to Azure-based data platform architecture and streaming fundamentals
  • Experience implementing lakehouse design principles using Bronze/Silver/Gold patterns (or equivalent)
  • Real-world data reliability engineering skills: handling late data, duplicates, schema evolution, and correctness
  • Mentorship from experienced engineers and insight into how data engineering supports business and operational outcomes
  • Strong portfolio-worthy deliverables: production pipeline components, documentation, and measurable improvements