Site Reliability Engineer - Unix Support - Director

Morgan Stanley • Edison, NJ

About The Position

In the Technology division, we leverage innovation to build the connections and capabilities that power our Firm, enabling our clients and colleagues to redefine markets and shape the future of our communities. This is a Lead Infrastructure Production Management & Reliability Engineering position at the Director level, which is part of the job family responsible for maintaining the stability and reliability of the organization's infrastructure systems, ensuring optimal performance and availability to support business operations. Morgan Stanley is an industry leader in financial services, known for mobilizing capital to help governments, corporations, institutions, and individuals around the world achieve their financial goals. Interested in joining a team that’s eager to create, innovate and make an impact on the world? Read on.

Enterprise Technology

Enterprise Technology & Services (ETS) delivers shared technology services for Morgan Stanley supporting all business applications and end users. ETS provides capabilities for all stages of Morgan Stanley's software development lifecycle, enabling productive coding, functional and integration testing, application releases, and ongoing monitoring and support for over 3,000 production applications. ETS also delivers all workplace technologies (desktop, mobile, voice, video, productivity, intranet/internet) in integrated configurations that boost the personal productivity of employees. Application and end user functions are delivered on a scalable, secure, and reliable infrastructure composed of seamlessly integrated datacenter, network, compute, cloud, storage, and database functions.

Enterprise Computing (EC)

The Enterprise Computing – Critical Infrastructure Support team is responsible for maintaining a diverse plant of Unix, Linux, and DDI infrastructure. This team serves as the highest level of escalation, supporting the Level 1 Command Center as well as Level 2 Unix/Linux Administrators. The team is further responsible for developing procedures and tools that enable Morgan Stanley's distributed plant to scale effectively, by eliminating or parallelizing repetitive operational tasks. The group is tasked with the deployment of all new distributed and storage infrastructure, including new datacenter build‑outs. Team members frequently interact with engineering teams and collaborate on the testing and certification of new hardware and software products.

What you'll do in the role:

This role is for a Senior Unix (L3) Specialist within the Enterprise Computing – Critical Infrastructure Support team. The position is responsible for providing Site Reliability Engineering (SRE)–focused operational support for the firm’s critical Unix and Linux infrastructure in a global, Follow‑the‑Sun operating model. Key responsibilities include on‑call support and participation in weekend project work as required to ensure continuous service availability. The role emphasizes proactive operational health, hygiene, and compliance activities to maintain system stability and to meet enterprise risk and control requirements, ensuring the integrity of production environments. The successful candidate will collaborate closely with engineering teams to test, certify, and introduce new hardware and software technologies into production. In addition, the role involves direct engagement with internal clients and stakeholders to understand business requirements and to provide effective support during incident and outage scenarios. The individual will also be responsible for onboarding new Unix infrastructure to support business growth initiatives and large‑scale programs, including new datacenter build‑outs.

Requirements

6+ years of enterprise‑scale Unix/Linux administration experience in large, distributed environments
Strong experience in incident, change, and problem management, with exposure to project coordination and operational reporting
Deep expertise in Red Hat Enterprise Linux administration and troubleshooting
Strong understanding of TCP/IP networking and associated technologies including NFS, NIS, DNS, and DHCP
Experience with Unix/Linux system monitoring, performance analysis, and operational reporting
Working knowledge of NTP and enterprise time synchronization technologies
Hands‑on experience with vulnerability assessment, remediation, and server patching in regulated environments
Administrative‑level scripting skills using Shell and one or more high‑level languages such as Perl or Python
Proven ability to troubleshoot complex infrastructure issues with an automation‑first mindset
Strong verbal and written communication skills, with the ability to operate effectively in a global, cross‑functional environment
Strong organizational skills with the ability to manage competing priorities and perform under pressure during critical incidents

Nice To Haves

Knowledge of the Infoblox DDI solution
Red Hat Enterprise Linux certification
Knowledge of Veritas Cluster Server or a similar high‑availability product
Knowledge of Ansible, OpenShift, and Kubernetes
Knowledge of Veritas Volume Manager
Prior experience working in the financial services industry

Responsibilities

Provide Site Reliability Engineering (SRE)–focused operational support for the firm’s critical Unix and Linux infrastructure in a global, Follow‑the‑Sun operating model
Provide on‑call support and participate in weekend project work as required to ensure continuous service availability
Carry out proactive operational health, hygiene, and compliance activities to maintain system stability and to meet enterprise risk and control requirements, ensuring the integrity of production environments
Collaborate closely with engineering teams to test, certify, and introduce new hardware and software technologies into production
Engage directly with internal clients and stakeholders to understand business requirements and to provide effective support during incident and outage scenarios
Onboard new Unix infrastructure to support business growth initiatives and large‑scale programs, including new datacenter build‑outs
