SRE - Storage Systems

Careers at Bloomberg

New York

Posted Mar 21, 2018 - Requisition No. 66134

A Service Reliability Engineer (SRE) at Bloomberg is a hybrid systems and software engineer who is trusted to improve the stability and availability of the production environment through automation. SREs are responsible for Monitoring, Provisioning / Configuration / Orchestration, Capacity Management, Deployment and Rollback, Incident Management, and SDLC practices.

The Storage team is responsible for the automated provisioning and monitoring of our backup, SAN, NFS and other storage technologies. The team strives to automate all aspects of the storage lifecycle including provisioning, mobility, change and decommissioning. Our current projects include a Storage as a Service platform with full API support, and planning an evolution towards Software Defined Storage.

What's In It For You?

You'll work with modern open-source tooling while maintaining mission-critical systems that host a wide array of applications. We'll depend on you to advise on the design, architecture, and scaling of a Storage platform that spans several technologies, and you'll play a critical role in improving the stability of that platform as we grow and add new storage services to it.

You'll Need to Have:

  • Demonstrated experience programming and testing in Python, Ruby, Go, or C/C++
  • Experience working in a 24/7 production engineering organization
  • Ability to listen, communicate, evaluate, problem solve, multi-task, and prioritize in a high-pressure, mission-critical, and rewarding team environment

We'd Love to See:

  • Deep expertise troubleshooting complex distributed systems
  • Experience with creating and improving documented procedures and/or playbooks
  • Working knowledge of Chef, Puppet, Ansible, or Salt
  • Familiarity with open source configuration, orchestration, and CI/CD tools
  • Deep understanding of TCP/IP, Unix networking, and Linux kernel performance
  • Familiarity with large-scale x86 infrastructure with thousands of machines
  • Familiarity with EMC, HP, Pure, Isilon, NetApp, and 3PAR storage
  • Familiarity with large Fibre Channel networks (Brocade)