The ML problems that define the future of cloud spend

CloudZero is the cost-per-anything model for cloud and AI, for humans and the agents they deploy.

We're inverting cost intelligence: from billing-first to telemetry-first. Every engineering decision is a buying decision, and we're building the platform that proves it in real time.

Telemetry-First, Cost-to-Produce, AI Inference, Agentic Governance, ML-Powered

CloudZero is inverting the traditional cost intelligence model. Instead of starting from the monthly bill, we're building toward a telemetry-first platform: lightweight collection agents inside customer environments, capturing every AI inference event, cloud resource usage, and product telemetry signal in real time. That data is reconciled against billing to produce total cost-to-produce intelligence. Not just COGS. The full picture.

AI is making every company look like a multi-tenant SaaS. Every enterprise now has per-model, per-token, per-customer AI inference complexity, and no one has a real-time answer for how to measure, govern, and optimize it. CloudZero is building that answer: a multi-tier architecture spanning real-time streaming (Kafka, Flink/KStreams), batch billing reconciliation, and an intelligent governance layer for both human engineers and the autonomous agents they deploy. Most of what makes this role extraordinary is what we're building next.

This is a founding technical engineer role. You won't be managing a team on day one; you'll be anchoring one. You'll set the technical patterns, solve the hardest data science problems in the product, and help build the team around you.

The vision: CloudZero becomes the cost-per-anything model for cloud and AI, for humans and the agents they deploy.

6 hard ML problems. They sit at the intersection of financial telemetry, cloud infrastructure, AI inference, and massive scale. Some are live in product today; several are what we're building next.
Real-time Unit Economics: Calculate per-unit costs across millions of transactions with dynamic efficiency management.
Predictive Cost Intelligence: Predict and prevent cost efficiency breaches before they impact the business.
Multi-Cloud Attribution: Accurately attribute cloud spend across complex systems using probabilistic modeling.
Autonomous Optimization: Build AI agents that make safe infrastructure changes within business constraints.
Job Type
Full-time
Career Level
Mid Level
Education Level
No Education Listed