Principal Associate, Data Scientist - Anti-Money Laundering

Capital One•McLean, VA

2d•$147,100 - $184,600

About The Position

Data is at the center of everything we do. As a startup, we disrupted the credit card industry by individually personalizing every credit card offer using statistical modeling and the relational database, cutting-edge technology in 1988! Fast-forward a few years, and this little innovation and our passion for data have skyrocketed us to a Fortune 200 company and a leader in the world of data-driven decision-making.

As a Data Scientist at Capital One, you’ll be part of a team that’s leading the next wave of disruption at a whole new scale, using the latest in computing and machine learning technologies and operating across billions of customer records to unlock the big opportunities that help everyday people save money, time and agony in their financial lives.

Team Description

The Anti-Money Laundering (AML) Modeling and Advanced Data Insights team is on a journey to modernize the way Capital One identifies potential money laundering, fraud, terrorist financing, and human trafficking through the use of advanced analytic techniques, statistics, and machine learning models. We develop predictive models, monitoring dashboards, and reporting using tools such as AWS, Snowflake, Python, and Spark. Our team produces the model outputs and data insights needed to operate our AML program efficiently and effectively. As the model developers advancing transaction monitoring and customer risk rating with machine learning, our team is responsible for end-to-end development, deployment, and monitoring of production models.

Role Description

In this role, you will:

Partner with a cross-functional team of data scientists, software engineers, business analysts, risk managers, and product owners to deliver industry-leading risk management products
Leverage a broad stack of tools and technologies (Python, Conda, AWS, Spark, dbt, and more) to build production-ready pipelines for data sourcing, model development, and model scoring
Build machine learning models and AI tools through all phases of development, from design through training, evaluation, validation, and implementation
Fine-tune, evaluate, customize, and productionize Large Language Models (LLMs)
Flex your interpersonal skills to translate the complexity of your work into tangible business goals

The Ideal Candidate is:

Innovative. You continually research and evaluate emerging technologies. You stay current on published state-of-the-art methods, technologies, and applications and seek out opportunities to apply them.
Creative. You thrive on bringing definition to big, undefined problems. You love asking questions and pushing hard to find answers. You’re not afraid to share a new idea.
Technical. You’re comfortable with open-source languages and are passionate about developing further. You have hands-on experience developing data science solutions using open-source tools and cloud computing platforms.
Statistically minded. You’ve built models, validated them, and backtested them. You know how to interpret a confusion matrix or a ROC curve. You have experience with clustering, classification, sentiment analysis, time series, and deep learning.

Requirements

Currently has, or is in the process of obtaining, one of the following, with an expectation that the required degree will be obtained on or before the scheduled start date:
A Bachelor's Degree in a quantitative field (Statistics, Economics, Operations Research, Analytics, Mathematics, Computer Science, or a related quantitative field) plus 5 years of experience performing data analytics
A Master's Degree in a quantitative field (Statistics, Economics, Operations Research, Analytics, Mathematics, Computer Science, or a related quantitative field) or an MBA with a quantitative concentration plus 3 years of experience performing data analytics
A PhD in a quantitative field (Statistics, Economics, Operations Research, Analytics, Mathematics, Computer Science, or a related quantitative field)

Nice To Haves

Master’s Degree in “STEM” field (Science, Technology, Engineering, or Mathematics) plus 3 years of experience in data analytics, or PhD in “STEM” field (Science, Technology, Engineering, or Mathematics)
At least 2 years’ experience in AML modeling or a related domain (e.g., Fraud or Credit Risk)
At least 1 year of experience developing and evaluating production-grade GenAI, Agentic AI, and/or LLM-based systems, including experience with vector databases, LLM fine-tuning, RAG, and use of LangGraph or LlamaIndex
At least 1 year of experience working with AWS
At least 3 years’ experience in Python and SQL
At least 3 years’ experience with machine learning

Responsibilities

Partner with a cross-functional team of data scientists, software engineers, business analysts, risk managers, and product owners to deliver industry-leading risk management products
Leverage a broad stack of tools and technologies — Python, Conda, AWS, Spark, dbt, and more — to build production-ready pipelines for data sourcing, model development, and model scoring
Build machine learning models and AI tools through all phases of development, from design through training, evaluation, validation, and implementation
Fine-tune, evaluate, customize, and productionize Large Language Models (LLMs)
Flex your interpersonal skills to translate the complexity of your work into tangible business goals
