SRE - Cloud Platform | New York, NY

Posted Mar 21, 2018 - Requisition No. 66139

An SRE at Bloomberg is a hybrid systems and software engineer who is trusted to improve the stability and availability of the production environment through automation. We're responsible for monitoring, provisioning, configuration management, orchestration, capacity planning, deployment and rollback, incident management, and systems development life cycle practices.

Our Team:

The Cloud Stability group is trusted to support Bloomberg's private cloud infrastructure. This infrastructure runs on our own open-source OpenStack distribution, built on OpenStack, Ubuntu, Chef, Ansible, and Ceph. It spans Bloomberg's own world-class data centers and global private network, hosting business-critical applications and services. You'll be trusted to ensure the high availability and scalability of this environment.

What's In It For You:

You'll work with modern open-source tooling while maintaining mission-critical systems that host a wide array of applications. We'll depend on you to advise on the design, architecture, and scaling of our virtual farms, which use several different technologies with different SLAs. In addition, you'll play a critical role in improving the stability of all cloud systems to help us ensure we have a solid platform as we scale.

You'll Need to Have:

Demonstrated experience programming and testing in Python, Ruby, Go, or C/C++
Experience working in a 24/7 production engineering organization
Ability to listen, communicate, evaluate, problem-solve, multitask, and prioritize in a high-pressure, mission-critical, and rewarding team environment
BA, BS, MS, or PhD in Computer Science, Engineering, or a related technology field

We'd Love to See:

Deep expertise troubleshooting complex distributed systems
Experience with creating and improving documented procedures and/or playbooks
Working knowledge of Chef, Puppet, Ansible, or Salt
Familiarity with open source configuration, orchestration, and CI/CD tools
Deep understanding of TCP/IP and Unix networking, and of Linux kernel performance (virtual memory and process scheduling)
Experience with virtualization technologies such as Docker or VMware
Familiarity with large-scale x86 infrastructure with thousands of machines

Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post: https://www.techatbloomberg.com/blog/bloomberg-bets-big-on-sres/