AI Infrastructure Engineer

General Dynamics Mission Systems, Inc

About The Position

ROLE AND POSITION OBJECTIVES:

Intelligent Data Management:

  • Use AI tools to analyze, map, and automate the data migration from the existing workflows and systems
  • Design modern, flexible data architectures that are not locked to legacy patterns
  • Leverage AI to detect data quality issues, validate migration results, and optimize database performance
  • Partner with the development team to auto-tune queries and optimize storage architecture

Automated Deployment & Self-Healing Infrastructure:

  • Build fully automated CI/CD pipelines with AI-powered testing, canary analysis, and automatic rollback
  • Deploy and manage applications using Kubernetes and container orchestration with infrastructure-as-code
  • Build self-healing systems that detect, diagnose, and resolve common issues automatically, escalating only truly novel problems to humans
  • Automate environment provisioning, scaling, and configuration so nothing is done manually

AI-Powered Automation & Integration:

  • Use AI to automate operational tasks, eliminating manual, repetitive work that doesn't scale
  • Build intelligent integration management between the new application and existing enterprise systems
  • Leverage AI-assisted tools to generate, optimize, and maintain deployment tooling
  • Auto-generate documentation from code, configurations, and system behavior

Proactive Reliability Engineering:

  • Implement AI-powered observability that detects patterns, predicts failures, and suggests or executes fixes automatically
  • Build systems where the lights rarely flicker: engineer reliability into the architecture rather than bolting it on after the fact
  • Use AI to analyze incident patterns and build preventative measures that eliminate entire classes of failures
  • Establish and track Service Level Objectives (SLOs) using automated data collection and reporting

Why This Role Matters: This is not a typical infrastructure job. You are building the deployment backbone for a strategic initiative that will define how GDMS builds and operates software going forward. The CI/CD pipelines, the data migration patterns, and the production infrastructure you create here will become the blueprint for the future in-house development project.

Why This Role Is Worth It:

  • You'll build from scratch. No legacy infrastructure to inherit: you design it, you build it, you own it.
  • You'll work with cutting-edge tools. AI-assisted development, modern deployment practices, Kubernetes orchestration: this is a modern tech stack, not a legacy maintenance job.
  • You'll have impact. Your work directly determines whether a mission-critical system succeeds in production.
  • You'll have support. Leadership is fully committed to this initiative and will clear roadblocks so you can focus on engineering.
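In its simplest form, the canary analysis and automatic rollback described above reduce to a statistical comparison between the baseline and canary deployments. A minimal sketch in Python, assuming error rates are sampled per monitoring interval (the function name, inputs, and `max_ratio` threshold are illustrative assumptions, not actual GDMS tooling):

```python
import statistics

def should_promote(baseline_error_rates, canary_error_rates, max_ratio=1.5):
    """Decide whether a canary release is healthy enough to promote.

    Compares the canary's mean error rate against the baseline's.
    If the canary errors more than `max_ratio` times the baseline,
    the pipeline would trigger an automatic rollback instead.
    """
    baseline = statistics.mean(baseline_error_rates)
    canary = statistics.mean(canary_error_rates)
    # Guard against a zero baseline (no errors observed at all):
    # promote only if the canary is also error-free.
    if baseline == 0:
        return canary == 0
    return canary <= baseline * max_ratio
```

A real pipeline would compare latency percentiles and saturation as well as error rates, and would gate promotion on several consecutive healthy intervals rather than a single comparison.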

Requirements

  • Requires a Bachelor's degree in Software Engineering, or a related Science, Engineering, Technology, or Mathematics field, plus 8+ years of job-related experience; or a Master's degree plus 6+ years of job-related experience. Agile experience preferred.
  • Department of Defense Secret security clearance is required at time of hire. Applicants selected will be subject to a U.S. Government security investigation and must meet eligibility requirements for access to classified information. Due to the nature of work performed within our facilities, U.S. citizenship is required.
  • Bachelor's degree in Computer Science, Software Engineering, Information Technology, or a related field, plus a minimum of 5 years of relevant experience; or Master's degree plus a minimum of 3 years of relevant experience
  • Demonstrated expertise with relational databases (Oracle, PostgreSQL, MySQL, or similar)
  • Hands-on experience deploying and managing applications with Kubernetes
  • Experience building and maintaining CI/CD pipelines (Jenkins, GitLab CI, GitHub Actions, or similar)
  • Experience with AWS RDS or other cloud-managed relational database services
  • Experience with cloud-managed Kubernetes (AWS EKS, Azure AKS, or GCP GKE)

Nice To Haves

  • Experience with data migration from legacy workflows and systems
  • Proficiency with infrastructure-as-code tools (Terraform, Ansible, or similar)
  • Experience with container technologies (Docker, Helm charts)
  • Familiarity with cloud platforms (AWS, Azure, or GCP)
  • Experience with monitoring and observability tools (Prometheus, Grafana, ELK stack, or similar)
  • Python scripting experience for automation and tooling
  • Understanding of Site Reliability Engineering (SRE) principles
  • Experience working in DoD or regulated environments

Responsibilities

  • Use AI tools to analyze, map, and automate the data migration from the existing workflows and systems
  • Design modern, flexible data architectures that are not locked to legacy patterns
  • Leverage AI to detect data quality issues, validate migration results, and optimize database performance
  • Partner with the development team to auto-tune queries and optimize storage architecture
  • Build fully automated CI/CD pipelines with AI-powered testing, canary analysis, and automatic rollback
  • Deploy and manage applications using Kubernetes and container orchestration with infrastructure-as-code
  • Build self-healing systems that detect, diagnose, and resolve common issues automatically, escalating only truly novel problems to humans
  • Automate environment provisioning, scaling, and configuration so nothing is done manually
  • Use AI to automate operational tasks, eliminating manual, repetitive work that doesn't scale
  • Build intelligent integration management between the new application and existing enterprise systems
  • Leverage AI-assisted tools to generate, optimize, and maintain deployment tooling
  • Auto-generate documentation from code, configurations, and system behavior
  • Implement AI-powered observability that detects patterns, predicts failures, and suggests or executes fixes automatically
  • Build systems where the lights rarely flicker: engineer reliability into the architecture rather than bolting it on after the fact
  • Use AI to analyze incident patterns and build preventative measures that eliminate entire classes of failures
  • Establish and track Service Level Objectives (SLOs) using automated data collection and reporting
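The SLO tracking in the last responsibility typically rests on an error-budget calculation: the SLO target implies how many failures are tolerable, and automated reporting tracks how much of that budget remains. A minimal sketch, assuming a request-based availability SLO (the function and its parameters are illustrative, not tied to any specific monitoring stack):

```python
def error_budget_remaining(slo_target, total_requests, failed_requests):
    """Return the fraction of the error budget still unspent.

    slo_target: availability objective, e.g. 0.999 for "three nines".
    The error budget is the allowed failure fraction (1 - slo_target)
    applied to total traffic. A result of 1.0 means the budget is
    untouched, 0.0 means it is exhausted, and a negative value means
    the SLO has been breached.
    """
    allowed_failure_fraction = 1.0 - slo_target
    budget = allowed_failure_fraction * total_requests  # failures we may tolerate
    if budget == 0:
        # A 100% target leaves no budget: any failure breaches it.
        return 0.0 if failed_requests else 1.0
    return 1.0 - failed_requests / budget
```

For example, a 99.9% target over 1,000,000 requests allows 1,000 failures; after 500 failures, half the budget remains. In practice the request counts would come from automated collection (e.g. a metrics pipeline) rather than manual input.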