About The Position

We are looking for a Senior Infrastructure Engineer (5-10 years of experience) to bridge the gap between internal developer velocity and global product delivery by owning the entire "Commit-to-Compute" lifecycle. You will build the high-performance CI/CD pipelines and self-service tooling that accelerate our internal C++/Python builds (The Inner Loop). At the same time, you will design and implement the "run-anywhere" distributed substrate, leveraging Kubernetes, Helm, and task orchestration, so that our platform deploys seamlessly across local, on-prem, and multi-cloud environments (The Outer Loop). You aren't just managing servers; you are building a product. You will partner with our 3 Engineering Managers and our Infrastructure/QA teams to redefine the developer experience and shape the next generation of how nTop is consumed by the world's leading aerospace, automotive, and medical device companies.

Requirements

  • Kubernetes & Docker: Deep experience building production-grade containerized environments and managing K8s resources (Deployments, Jobs, HPA).
  • CI/CD Engineering: Expertise managing Jenkins (Shared Libraries, Pipeline-as-Code) and integrating it with GitHub for automated testing and release management.
  • Infrastructure as Code: Proficiency with Terraform for managing cloud-agnostic infrastructure.
  • Scripting & Tooling: Strong proficiency building developer productivity tools, automation frameworks, scripts, and worker services.

Nice To Haves

  • Experience with GPU-accelerated workloads.
  • Experience with AI/ML technologies and evaluation (evals) infrastructure.

Responsibilities

  • Scalable Automation Frameworks: Architect modular, reusable CI/CD and distributed testing frameworks that provide rapid, automated feedback loops for code quality, security, and performance directly within the developer workflow.
  • Engineering Self-Service & Tooling: Design and maintain internal developer platforms and an ecosystem of frameworks and tools that automate builds, tests, and releases.
  • Design & Build Distributed Systems: Build the task-queue and orchestration layer to manage nTop worker lifecycles.
  • Hybrid-Cloud Infrastructure: Develop "run-anywhere" deployment patterns using Kubernetes, Helm, and Terraform that work seamlessly across AWS, GCP, and on-premises hardware.
  • Container Security & Isolation: Implement secure execution environments for untrusted code/designs using technologies like gVisor or Kata Containers.
  • Production CI/CD: Design and maintain high-performance build pipelines in Jenkins and GitHub Actions.
  • Deployment Tooling: Develop and support Helm charts and Terraform modules that allow our clients to deploy in their own private clouds or on-prem data centers.
  • Storage Abstraction: Build or integrate S3-compatible storage layers to handle high-throughput telemetry and design file I/O.

Benefits

  • Outstanding PTO and leave policy
  • ISO options
  • Healthcare: Medical, Dental, and Vision plans
  • 401k with generous matching
  • Annual stipend for continued career learning/development
  • Commuter benefits for NY-based hires