Senior Lead Data Engineer (R4680)

Shield AI | San Francisco, CA
$140,000 - $210,000

About The Position

Shield AI is building a modern financial enterprise reporting and analytics platform to support a complex, R&D-intensive, multi-billion-dollar organization. As a Senior Data Engineer, you will architect and implement secure, scalable, and cost-efficient data systems in Microsoft Fabric that power Finance, FP&A, Accounting, Program Management, and Executive reporting.

This role goes beyond traditional ETL development. You will own cloud architecture decisions, drive data modeling standards, embed advanced analytics readiness into the platform, and ensure that our systems are built for auditability, governance, and scale. In a lean team, you will operate across engineering, analytics, and applied data science enablement.

You will design the backbone of a mission-critical analytics platform that supports executive decision-making, forecast accuracy, compliance reporting, and scalable AI adoption. Your work will directly influence cost control, capital allocation, and operational insight across the enterprise.

Requirements

  • 7+ years of experience in data engineering or cloud data architecture roles.
  • Expert-level proficiency in SQL and Python; strong hands-on experience with PySpark.
  • Production experience in Microsoft Fabric, Databricks, Snowflake, or comparable cloud data platforms.
  • Strong understanding of dimensional modeling, star schemas, and analytics engineering best practices.
  • Experience designing secure, compliance-aware data platforms in regulated environments.
  • Demonstrated ability to independently own architectural decisions and deliver enterprise-grade systems.
  • Working knowledge of statistics, model evaluation concepts, and ML production workflows.
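To make the dimensional-modeling requirement concrete, here is a minimal star-schema sketch: one fact table joined to conformed dimensions, queried at a reporting grain. All table and column names (`fact_gl`, `dim_account`, `dim_period`) are hypothetical illustrations, not systems named in this posting, and plain SQLite stands in for a warehouse engine:

```python
import sqlite3

# Minimal star schema: one fact table keyed to two dimension tables.
# All names and values below are illustrative only.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.executescript("""
CREATE TABLE dim_account (account_key INTEGER PRIMARY KEY, account_name TEXT);
CREATE TABLE dim_period  (period_key  INTEGER PRIMARY KEY, fiscal_month TEXT);
CREATE TABLE fact_gl (
    account_key INTEGER REFERENCES dim_account(account_key),
    period_key  INTEGER REFERENCES dim_period(period_key),
    amount      REAL
);
INSERT INTO dim_account VALUES (1, 'R&D Labor'), (2, 'Travel');
INSERT INTO dim_period  VALUES (10, '2024-01'), (11, '2024-02');
INSERT INTO fact_gl VALUES (1, 10, 5000.0), (1, 11, 6000.0), (2, 10, 750.0);
""")

# A typical reporting query: facts sliced by conformed dimensions.
rows = cur.execute("""
    SELECT a.account_name, p.fiscal_month, SUM(f.amount)
    FROM fact_gl f
    JOIN dim_account a ON a.account_key = f.account_key
    JOIN dim_period  p ON p.period_key  = f.period_key
    GROUP BY a.account_name, p.fiscal_month
    ORDER BY a.account_name, p.fiscal_month
""").fetchall()
print(rows)
```

The design choice the role calls for is exactly this separation: additive measures in the fact table, descriptive attributes in dimensions, so BI and self-service tools can slice the same facts consistently.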

Nice To Haves

  • Experience with Azure or AWS cloud-native services (IAM, storage, networking, security controls).
  • Hands-on exposure to ML model deployment pipelines and feature engineering workflows.
  • Experience with finance systems (ERP, FP&A platforms, project accounting systems).
  • Familiarity with governance frameworks (DCAA, FAR/DFARS, CUI) and audit support processes.
  • Experience building CI/CD pipelines for data systems using GitHub or equivalent tools.

Responsibilities

  • Architect and implement end-to-end ELT/ETL pipelines (Python, SQL, PySpark) in Microsoft Fabric Lakehouse environments, drawing on experience with Azure Data Factory, Synapse, or comparable Azure-native data platforms.
  • Design and optimize medallion architectures (raw, curated, semantic) for enterprise-scale financial and corporate data.
  • Build secure, automated ingestion frameworks integrating ERP, FP&A, project accounting, and operational systems.
  • Design orchestration and monitoring frameworks to ensure reliability, observability, and audit readiness.
  • Optimize Fabric compute unit (CU) usage, query folding strategies, partitioning, indexing, and workload isolation.
  • Develop production-grade data models that support BI, self-service analytics, and advanced statistical modeling.
  • Collaborate on productionizing ML-enabled workflows such as GL classification, anomaly detection, and forecasting.
  • Implement data quality validation, reconciliation automation, and governance controls aligned with compliance standards.
  • Partner with Finance and Product stakeholders to translate business needs into scalable technical solutions.
  • Mentor engineers and contribute to engineering standards, CI/CD, code review practices, and documentation.
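As a sketch of the medallion pattern the responsibilities describe (raw → curated → semantic), here is a dependency-free Python illustration with a basic data-quality gate; in Fabric this would typically be PySpark over Lakehouse tables, and every record, field, and account code below is hypothetical:

```python
from collections import defaultdict

# Bronze: raw records landed as-is from a source system (illustrative data).
bronze = [
    {"txn_id": "T1", "account": "6100", "amount": "5000.00"},
    {"txn_id": "T2", "account": "6100", "amount": "6000.00"},
    {"txn_id": "T2", "account": "6100", "amount": "6000.00"},  # duplicate delivery
    {"txn_id": "T3", "account": "7200", "amount": "bad"},      # fails validation
]

# Silver: deduplicate on the business key and enforce basic data quality,
# quarantining failures instead of silently dropping them (audit readiness).
seen, silver, rejects = set(), [], []
for rec in bronze:
    if rec["txn_id"] in seen:
        continue  # drop repeated deliveries of the same transaction
    seen.add(rec["txn_id"])
    try:
        silver.append({**rec, "amount": float(rec["amount"])})
    except ValueError:
        rejects.append(rec)  # held for reconciliation review

# Gold: aggregate to the grain the semantic/reporting layer consumes.
gold = defaultdict(float)
for rec in silver:
    gold[rec["account"]] += rec["amount"]

print(dict(gold), len(rejects))
```

The point of the layering is that each hop is independently observable: row counts and reject counts between bronze and silver feed reconciliation automation, while the gold layer stays stable for BI consumers.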

Benefits

  • Bonus
  • Benefits
  • Equity