Dive in and do the best work of your career at DigitalOcean. Journey alongside a strong community of top talent who are relentless in their drive to build the simplest scalable cloud. If you have a growth mindset, naturally like to think big and bold, and are energized by the fast-paced environment of a true industry disruptor, you’ll find your place here. We value winning together while learning, having fun, and making a profound difference for the dreamers and builders in the world.

We want people who are passionate about designing and operating secure systems at scale.

We are looking for an experienced, motivated, adaptable, empathetic, automation-focused infrastructure engineer who is comfortable working remotely. You will report to the Engineering Manager of the Foresight team, with a primary mission of “Help deliver GPU systems rapidly”. You will architect, build, support, and scale the team’s Provisioning Automation system, which will be used to quickly and reliably provision hardware at DigitalOcean. This position will involve a high degree of leadership, ownership, and autonomy: we will need to release this system quickly and be prepared to scale it 10x. This is a fast-paced role with a lot of opportunity.

In addition to our primary mission, we have many other responsibilities. For example:

- We develop and support several infrastructure systems built in Go
- We develop and maintain several fleet visualization utilities, written in Go and React
- We write small system utilities or daemons that run on physical hosts and report metrics about them
- We expose metrics to leadership for intelligent decision-making, including system firmware versions, provisioning success rates, etc.
- We help operational teams meet deadlines by keeping them informed of project progress

Our team has a big scope, but don’t let it deter you. We’re a group of kind folks. More than anything, we’re looking for someone empathetic, motivated, and driven to grow with us.
Also, we’re looking to expand our team’s StackStorm expertise. If you have StackStorm experience, that’s a bonus!

DigitalOcean’s Internal Culture and Tooling:

DigitalOcean teams communicate primarily via Slack. Foresight makes light use of Jira and GSuite. We strive to keep our work-life balance comfortable, and we aim to scope high-impact work appropriately so that everyone works at a healthy pace. You can expect to be on-call periodically once you are ready, but you shouldn’t expect to be paged often.

DigitalOcean’s observability platform comprises VictoriaMetrics, Grafana, Alertmanager, and Elasticsearch. Knowing any of these tools is a bonus, because every service at DO is generally expected to use this platform.

The Foresight team is an arm of the Hardware Lifecycle Engineering (HLE) organization. We aim to boost productivity by enabling our engineers to rapidly and reliably deploy hardware in various configurations, managing the lifecycle from standup to decommission. The HLE group is made up of roughly 14 engineers from diverse backgrounds, located across the US, Canada, and Europe. The Foresight team accounts for approximately 30% of the HLE group. Within Foresight, there are growth opportunities along several tracks (e.g. Tech Leader, Subject Matter Expert (SME), Project Management, Engineering Manager).
What You’ll Be Doing:

As an engineer, you will spend your day-to-day on:

- Developing impactful, new, and innovative systems that will help DigitalOcean scale
- Responding to provisioning failures
- Working to ensure that common provisioning failures do not recur (likely via automation)
- Collaborating with sibling teams to deliver on wider organizational goals
- Bringing new and actionable information to light by developing visualization tooling
- Having fun with an amazing and welcoming team 🙂

Here are some things we’ve spent our time on in the past few months:

- Developed a “provisioning-specific view” in our visualization interface
- Architected the MVP of an automated provisioning system
- Manually provisioned 300+ systems in order to meet aggressive deadlines (we’re not above manual work to hit our goals and feel the pain of our customers!)
- Developed alerts for out-of-date hardware system firmware

What We’ll Expect From You:

NB: If you don’t meet all of the expectations below, that’s okay! Submit an application, and be sure to include a cover letter telling us why you’d be a good fit for our team.
Job Type: Full-time
Career Level: Mid Level
Education Level: No Education Listed
Number of Employees: 1,001-5,000 employees