What You’ll Do

- Design and build Spark ETL pipelines on the AWS data platform (an illustrative sketch follows the requirements list below).
- Collaborate with cross-functional teams, such as data scientists, fraud, marketing, and other business stakeholders, to understand their data needs and deliver reliable solutions.
- Optimize data infrastructure: design and maintain robust data infrastructure using modern data platform architecture.
- Ensure data quality and reliability.
- Innovate and follow best practices.
- Ensure operational excellence of the data platform, including monitoring, incident response, performance optimization, and continuous improvement.

Who We’re Looking For (“Must Haves”)

- Professional experience in data warehousing, data architecture, and/or data engineering environments, especially with Spark, Hadoop, Hive, and similar tools, along with a solid understanding of streaming pipelines.
- Proficiency in at least one high-level programming language (Scala, Java, Python, or equivalent).
- A good understanding of databases.
- You have built large-scale data products and understand the tradeoffs involved in building them.
- You have a deep understanding of system design, data structures, and algorithms.
- You have excellent knowledge of distributed computing frameworks such as Hadoop MapReduce and Spark.
- You have strong knowledge of the following AWS services: EMR, S3, Lambda, Redshift, etc.
- You have a strong understanding of data quality and governance.
- You are a team player and a self-driven, highly motivated individual who loves to learn new things.
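For a sense of the day-to-day work, here is a minimal sketch of the kind of Spark ETL pipeline described above: extract raw events from S3, apply a simple data-quality gate, aggregate, and load the result back to S3 for downstream consumers. All bucket names, paths, column names, and thresholds are hypothetical illustrations, not this team's actual pipeline.

```python
# Minimal PySpark ETL sketch. Paths, columns, and the 95% quality
# threshold are hypothetical; "s3://" URIs assume running on EMR (EMRFS).
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("example-daily-events-etl")  # hypothetical job name
    .getOrCreate()
)

# Extract: raw JSON events landed in S3 (hypothetical path and schema).
raw = spark.read.json("s3://example-raw-bucket/events/")

# Transform: drop malformed rows and deduplicate on a hypothetical event_id.
clean = (
    raw.filter(F.col("event_id").isNotNull())
       .dropDuplicates(["event_id"])
)

# Data-quality gate: fail fast if too many rows were dropped.
raw_count, clean_count = raw.count(), clean.count()
if raw_count > 0 and clean_count / raw_count < 0.95:
    raise ValueError("Data-quality gate failed: >5% of rows were malformed")

# Aggregate: daily event counts per user (hypothetical columns).
daily = (
    clean.groupBy(F.to_date("event_ts").alias("event_date"), "user_id")
         .agg(F.count("*").alias("event_count"))
)

# Load: partitioned Parquet back to S3, e.g. for a Redshift COPY
# or a Spectrum external table downstream.
(
    daily.write.mode("overwrite")
         .partitionBy("event_date")
         .parquet("s3://example-curated-bucket/daily_user_events/")
)

spark.stop()
```

In practice a job like this would run on EMR on a schedule, with the quality gate, row counts, and runtime feeding the monitoring and incident-response processes mentioned above.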
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed