About The Position

GitHub Revenue is growing its Data Science team, and we're seeking experienced professionals to elevate our data and analytics efforts. As a Senior Data Scientist in Revenue, you will leverage your deep expertise in data science, machine learning, and business to lead data acquisition efforts, conduct thorough reviews of data analysis and data quality, form hypotheses, and discover insights in the data to support business stakeholders and their decision-making. You will provide feedback to the engineering team to identify potential future business opportunities, and track advances in industry and academia to adapt algorithms and techniques that drive innovation and new solutions. The ideal candidate will contribute to the impact of our Data Science initiatives and gain deep insights into the latest advancements in AI, machine learning, and data science.

Requirements

  • Bachelor's Degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 5+ years experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
    OR Master's Degree in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 3+ years experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
    OR Doctorate in Data Science, Mathematics, Physics, Statistics, Economics, Operations Research, Computer Science, or related field AND 1+ year(s) experience in data science (e.g., managing structured and unstructured data, applying statistical techniques) or related field
    OR equivalent experience
  • 3+ years of experience with programming languages such as Python or R, query languages such as SQL and KQL, and data manipulation tools like Spark and Airflow

Nice To Haves

  • Technical understanding of data science techniques for regression, classification, time-series analysis, experimental design, and causal inference
  • Able to clearly communicate findings to non-technical stakeholders through storytelling and visualization with tools like Jupyter notebooks or Azure Data Explorer / PowerBI dashboards

Responsibilities

  • Lead data acquisition efforts and ensure data is properly formatted and accurately described, while adhering to GitHub's privacy policies
  • Mentor others in data cleaning and data analysis best practices. Identify gaps in current data sets and drive onboarding of new data sets from production systems or third-party vendors.
  • Resolve data integrity problems in collaboration with relevant teams to promote upstream change and long-term quality
  • Leverage broad and deep knowledge of modeling techniques, AI/ML tools, programming languages and query languages to create models, conduct experiments, and analyze results. Evaluate the methodology and performance of team members' models and recommend improvements. Anticipate the risks of data leakage, bias/variance tradeoff, and methodological limitations.
  • Drive best practices for model validation, implementation, and application, and partner with teams across the organization to identify and explore new opportunities for driving transformative solutions for our stakeholders and customers.
  • Develop and articulate data-driven strategies in consideration of business priorities and lead conversations with end customers and/or internal stakeholders to understand, define, and solve business problems.
  • Track advances in industry and academia, and adapt algorithms and/or techniques to drive innovation and develop new solutions. Serve as a subject matter expert and mentor for team members.
  • Communicate complex statistics and machine learning topics to diverse audiences (e.g., multidisciplinary teams, customers, technical and non-technical audiences)
  • Independently write efficient, readable, extensible code that spans multiple features/solutions. Contribute to the code/model review process by providing feedback and suggestions for implementation and improvement.
  • Drive operational excellence for model deployment (i.e. performance, scalability, monitoring, maintenance, integration into engineering production system, stability)
  • Produce project plans to define necessary steps required for completion, leading to a measurable improvement in business performance metrics over time. Utilize project results to decide on next steps (e.g., deployment, further iterations, new projects).