Lead Soft Prod Mgmt & RE : Job Level - Director

Morgan Stanley | New York, NY
1d | $120,000 - $165,000 | Hybrid

About The Position

As a member of the Reliability & Production Engineering (RPE) team, you will bring your passion for technology to help the team operate more efficiently in a fast-paced environment and provide best-in-class services to the firm's clients across Business Units. The focus is on improving production management, system service availability, observability, scalability, performance, and resilience by applying sound monitoring and software/reliability engineering principles and by adopting the latest technology and tooling. The role provides production support services within the RPE organization. It also requires developing automation and tooling to support SRE activities and to achieve specific reliability and supportability goals (reduction of toil, monitoring and alerting efficiency, etc.) for in-scope systems and across the larger organization.

Requirements

  • Bachelor's degree in computer science or a related field
  • Proficiency with Linux
  • Strong experience in database scripting (stored procedures and compound SQL) and data analysis in PostgreSQL, DB2, Snowflake, MongoDB, etc.; DB monitoring and performance tuning
  • Working experience in Python/Shell scripting
  • Troubleshooting skills (tracking trends, producing metrics and analysis)
  • Strong verbal and written skills required to interact with global teams and customers
  • Flexibility to work in shifts and take on-call responsibility
  • Working from the office (3 days per week minimum is the current policy)

Nice To Haves

  • Experience with financial services/products or investment banking
  • Experience with advanced monitoring/alerting tools (Grafana, Loki, Prometheus, Elasticsearch, etc.)
  • Knowledge of development tools such as Git/GitHub, Jenkins, etc.
  • Agile/DevOps/SRE mindset and/or tooling
  • Understanding of cloud technology

Responsibilities

  • Working closely with engineering/development teams to design, build, and maintain systems.
  • Troubleshooting issues across the entire technology stack: hardware, software, application, and network.
  • Identifying and driving opportunities to improve automation for our platforms; scope and create automation for deployment, management, and visibility of our services.
  • Proactively identifying and addressing systems reliability risks.
  • Working alongside existing global and regional team members on a follow-the-sun basis.
  • Representing the RPE organization in design reviews and operational readiness exercises for new and existing services.