Linux System Administrator

Savannah River National Laboratory, Aiken, SC

About The Position

Savannah River National Laboratory (SRNL) is seeking an experienced Linux Systems Administrator to be responsible for the engineering, implementation, and sustainment of enterprise Linux platforms that support SRNL research and mission operations. The role owns the secure, reliable, and auditable operation of Linux systems and supporting services across their lifecycle (standard builds, patching and vulnerability remediation, configuration management, monitoring and incident response support, and backup/recovery), aligned to DOE cybersecurity expectations and NIST control-driven governance. The successful candidate is expected to work independently on complex issues, elevate team practices through automation, and partner effectively with developers and service owners to modernize deployments for both commercial off-the-shelf (COTS) and homegrown applications.

Requirements

  • Bachelor's Degree and 5+ years of Linux systems administration experience in production environments (mid-level benchmark), with senior consideration typically reflecting broader scope and leadership beyond that baseline.
  • Demonstrated ability to patch, troubleshoot, and remediate vulnerabilities and operational issues in heterogeneous enterprise environments.
  • Professional experience automating administration tasks with Ansible and scripting (Bash and/or Python).
  • Experience administering Linux in virtualized infrastructure (e.g., VMware-hosted Linux workloads).
  • Experience implementing monitoring/alerting and performing triage and root-cause analysis.
  • Hands-on experience supporting application environments across multiple lifecycle tiers (Development/Staging/Production), including COTS and custom applications; ability to partner with developers to deliver stable platforms.
  • Working knowledge of security and compliance practices aligned to NIST controls and STIG/SCAP-style baselines, including evidence-based validation.
  • Proven experience managing RHEL (including Satellite) and Ubuntu, and working with Ansible and Automation Controller, Bash/Python, VMware vSphere, SAN/NAS concepts, OpenSCAP/STIG, identity integration (AD/LDAP), backup/DR, monitoring/alerting/logging, and container/Kubernetes platforms where applicable.

Nice To Haves

  • Red Hat Satellite administration experience for lifecycle content and patch management at scale (content views, lifecycle environments, Capsule concepts).
  • Experience deploying, configuring, and maintaining Ubuntu Server environments for high-availability services, with an understanding of secure server deployments and automation using Ansible or Bash scripting.
  • Experience using Automation Controller to operationalize automation with RBAC, auditing controls, and workflow-based release/operations orchestration.
  • Demonstrated experience hardening systems and validating compliance using OpenSCAP/SCAP Security Guide profiles aligned to STIG (or similar baselines), including remediation automation.
  • Experience with identity integration patterns for Linux and applications (SSO, certificates, service authentication), including troubleshooting cross-system authentication issues.
  • Container/Kubernetes experience (deploying/operating production clusters or supporting workloads on those platforms), particularly where modernization efforts benefit from standardized packaging and orchestration.
  • Storage and recovery depth: experience with enterprise storage concepts and backup/recovery validation aligned to “restore works” expectations.

Responsibilities

  • Engineer, deploy, patch, and troubleshoot production Linux systems (virtual and/or physical), including standardized build practices and disciplined lifecycle management to maintain availability and security.
  • Operate an OS patch and vulnerability remediation program with measurable outcomes (coverage, timeliness, rollback readiness), using automation and centralized tooling where appropriate (e.g., Red Hat Satellite for lifecycle-stage patching).
  • Build and maintain automation for configuration, patching, and operational tasks using Ansible; treat automation as a product (versioned, reviewed, tested) rather than one-off scripts.
  • Operationalize automation at enterprise scale using Automation Controller capabilities (RBAC, auditing, workflows, credential handling) to safely delegate and repeat administrative actions.
  • Implement security hardening and compliance validation aligned to applicable baselines (e.g., STIG/SCAP profiles), including routine OpenSCAP-based scanning, remediation, and evidence generation to support audits and continuous monitoring.
  • Configure and tune monitoring and alerting, perform root-cause analysis, and support incident response by investigating system-level failure modes (performance, availability, security signals) and reducing false positives/negatives through tuning.
  • Integrate Linux hosts and services with enterprise identity and authentication systems (e.g., AD/LDAP, certificates, SSO), applying least privilege and access governance expectations to both users and services.
  • Support developers and modernization: provide direct platform support to development teams, including provisioning and maintaining Dev/Staging/Prod environments, enabling safer deployments, and improving run-time observability for both COTS and homegrown applications.
  • Modernize COTS and homegrown applications: lead or co-lead modernization initiatives such as standardizing runtime dependencies, improving configuration management for COTS, enabling repeatable builds, and adopting containers/Kubernetes where they are a good fit for the workload and operations model.
  • Own backup and recovery readiness for assigned Linux services by coordinating backups, validating restores, and participating in contingency and recovery exercises (including recovery point/time objective alignment where defined).