SRE - Kubernetes for Advanced Compute | New York, NY

Posted Jul 20, 2018 - Requisition No. 68291

Bloomberg runs on data. It's our business and it's our product. It's why thousands of companies partner with us. We've surpassed petabytes of data, with no end in sight.

Our Team:

We provide the infrastructure that supports core data services, including Bloomberg’s data science and machine learning platform, search infrastructure, and various others. Our challenges span both software and hardware, and the scale we work at is massive.

Our team is trusted to build the systems that run our newest cutting-edge platforms. We’re also depended upon to manage the configuration, deployment, and operation of the systems that power the data backend of Bloomberg. On our team, you’ll have the ability to truly innovate and invent, helping define the technical foundations of groundbreaking systems. Using containerization and Kubernetes on top of leading-edge hardware, including GPUs and DNN-specific hardware, we’ve built systems that rival supercomputing platforms across the world.

We’ll trust you to:

Design, build, and automate new solutions centered around the Kubernetes container orchestration platform and its ecosystem of projects
Be responsible for solutions that maintain the configuration and robustness of systems
Analyze performance, decide where to place metrics and how to interpret them, and plan capacity
Troubleshoot and debug runtime issues with software and hardware
Perform OS- and hardware-level optimizations
Interact with platform developers to understand and validate their workflows, requirements, application performance, and application resilience

What’s in it for you:

An opportunity to make key technical decisions which help define the future of data and analytics infrastructure platforms
The chance to apply your existing experience while gaining new, cutting-edge experience in Kubernetes, containerization, GPUs, data science, and distributed database systems
Your solutions will drive new functionality within the Bloomberg Terminal and other client interfaces -- direct drivers of financial decisions around the world

You need to have:

2+ years of systems configuration and automation experience (e.g. Ansible, Chef, Puppet, SaltStack -- error handling, idempotency, configuration management)
2+ years of Linux systems experience (Ubuntu or Debian experience preferred; ideally conversant in Unix networking and C system calls)
Proven experience in a programming and/or scripting language (e.g. Python, Go, Java, Ruby)
A strong familiarity with Continuous Integration and Continuous Deployment methodologies, ChatOps, etc.
Proven experience building and scaling out mission-critical, elastic, load-distributed, and high-throughput systems
BA, BS, MS, or PhD in Computer Science, Engineering, or a related technology field

We’d love to see:

Experience with networking (e.g. packet analysis, routing protocols)
Open source experience (a well-curated blog, upstream-accepted contributions, or a community presence)

How we give back:

This new team will make extensive use of open source software. As part of that, we are committed to upstreaming the features we develop within Kubernetes and its ecosystem. Whether we are pushing bug fixes upstream, developing new features, giving presentations at conferences and meetups, or collaborating with industry leaders, open source will be at the heart of this new team. It's not just something we do in our free time -- it is how we work.

Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post: https://www.techatbloomberg.com/blog/bloomberg-bets-big-on-sres/