Principal Software Engineer (AI)

SAP Taulia

Remote

About The Position

We are seeking a high-caliber Principal Software Engineer to serve as the Technical Lead (TL) and Technical Authority for the team’s AI initiatives. In this role, you are the technical execution leader for the team, owning the technical direction for the team’s scope and ensuring the delivery of maintainable, high-quality AI solutions. You will work closely with the architecture team, contributing to the design of our AI ecosystem. As a master practitioner, you will balance the "How" of sophisticated GenAI, including agentic workflows and RAG architecture, with the operational rigor required for safe and efficient delivery. You will lead by example, breaking down the most complex technical hurdles in the AI stack and raising the technical bar for the entire team.

Requirements

Proven Technical Leadership: Extensive experience as a Principal Engineer or Technical Lead with a track record of owning technical execution and delivery for complex software teams.
Advanced AI Product Engineering: Mastery of Generative AI implementation patterns, including agentic orchestration, prompt engineering, and RAG.
Software Engineering Excellence: Expert-level proficiency in Java and Spring AI Core, with deep knowledge of microservices, distributed systems, and modern testing discipline.
LLMOps & Infrastructure: Hands-on experience with Vector Databases, model monitoring, and the infrastructure required to scale AI safely.
System Design & Risk Mitigation: Strong ability to design for "non-deterministic" outputs, defining constraints and verification processes that ensure AI reliability.
Communication & Influence: Ability to distill complex AI architectures into actionable plans and clear standards that enable the team to execute with high velocity.
Bachelor's Degree or equivalent practical experience.

Nice To Haves

Operational Knowledge: Experience with cost-optimization and latency-reduction strategies for high-volume LLM deployments.
Fine-tuning: Practical experience with fine-tuning open-source models (e.g., Llama, Mistral) for specific domain tasks.
FinTech Background: Experience building high-compliance, secure software within the financial services industry.
CloudOps Infrastructure: Familiarity with CI/CD automation and cloud-native scaling strategies.

Responsibilities

Technical Direction & Design: Collaborate closely with the architecture team, contributing to and maintaining the technical direction for the team’s AI scope. Drive designs for AI services that balance time-to-value with reliability, security, and maintainability, ensuring alignment with architectural guardrails and platform constraints.
Technical Execution & Delivery Enablement: Act as the primary driver of execution on the team. Break down complex AI technical work into executable increments and identify technical risks early. Actively remove engineering blockers such as build/deploy friction, unclear interfaces, and excessive coupling.
Agent Architecture: Collaborate with the architecture team to design and build the AI agents and autonomous frameworks that scale the team’s intent. You are responsible for the logic that allows AI to perform complex, multi-step tasks reliably while ensuring these systems are built for long-term maintainability.
Quality & Maintainability: Uphold the highest standards for code quality, testing discipline, and AI evaluation. Drive technical debt management and refactoring for AI systems, aligned with business priorities and risk.
AI Measurement & Reliability (LLMOps): Collaborate with the architecture team and contribute to the evaluation frameworks used to measure model effectiveness and reliability. Define the metrics and observability standards that ensure our AI products stay healthy and "safe" in production.
Mentorship & Technical Leadership: Mentor engineers to raise the technical bar across the team. Lead technical design reviews and establish reusable patterns so that other engineers can contribute to the AI stack effectively and safely.
Incident Leadership (Technical): Act as the primary technical leader during AI-related incidents. Direct triage, define debugging strategies, and oversee safe mitigation and recovery plans for non-deterministic AI systems.