About The Position

This role is essential to enabling Health Algorithms research by ensuring high-quality data at scale. You will collaborate with cross-functional teams, such as the data engineering, data science, algorithms, design, operations, and QA teams, to test, validate, and monitor data platforms and the data flowing through them. To do that, you will design, develop, and own software services and test automation frameworks. Your work will have a direct and essential impact on future Apple products.

Requirements

  • BS degree in Computer Science, Data Science, or a related field
  • Experience with Python

Nice To Haves

  • 2+ years of relevant industry experience
  • Strong software development skills, with 2+ years of experience programming in Python
  • Experience with Apache Spark/PySpark, Apache Airflow, Pandas/Numpy
  • Experience with infrastructure test automation and CI/CD workflows
  • Experience with data visualization
  • Familiarity with cloud-based infrastructure such as AWS
  • Practical experience with SQL and NoSQL databases (MySQL, Postgres, MongoDB, etc.)
  • Quality-driven and detail-oriented, with excellent interpersonal skills

Responsibilities

  • Design, own, and maintain scalable, robust test automation frameworks; create services and tools for testing and monitoring data processing infrastructure and data monitoring reports.
  • Provide feedback and reporting on data pipelines and inform partnering teams of issues or progress.
  • Test the functionality and accuracy of data pipelines and data monitoring tools using Python and Spark.
  • Understand complex software and data processing systems, break them down into smaller components, and dive deep into the implementation to ensure that robust, reliable data processing produces the highest-quality data.
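To illustrate the data-quality testing the responsibilities above describe, here is a minimal sketch in plain Python of the kind of row-level validation such a framework might automate. The field names, validation rules, and thresholds are hypothetical examples, not details from the role; in production this logic would typically run over Spark DataFrames rather than in-memory rows.

```python
import csv
import io

def validate_rows(rows, required=("user_id", "heart_rate")):
    """Partition rows into (valid, errors) using simple quality rules.

    The rules and field names below are illustrative assumptions:
    required fields must be non-empty, and heart_rate must parse as an
    integer within a plausible physiological range.
    """
    valid, errors = [], []
    for row in rows:
        # Rule 1: every required field must be present and non-empty.
        if any(not row.get(field) for field in required):
            errors.append((row, "missing required field"))
            continue
        # Rule 2: heart_rate must be numeric.
        try:
            hr = int(row["heart_rate"])
        except ValueError:
            errors.append((row, "non-numeric heart_rate"))
            continue
        # Rule 3: heart_rate must fall in an assumed plausible range.
        if not 20 <= hr <= 250:
            errors.append((row, "out-of-range heart_rate"))
            continue
        valid.append(row)
    return valid, errors

# Hypothetical sample: one clean row, one missing value, one outlier.
sample = io.StringIO(
    "user_id,heart_rate\n"
    "u1,72\n"
    "u2,\n"
    "u3,999\n"
)
valid, errors = validate_rows(list(csv.DictReader(sample)))
```

A test automation framework would wrap checks like these in a test runner (e.g. pytest) and report per-rule failure counts, rather than returning raw lists as this sketch does.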