Software Engineer/SRE - Application Middleware

Careers at Bloomberg

New York, NY

Posted Mar 8, 2018 - Requisition No. 65104

Our Team:

Bloomberg systems are known to be fast and reliable; we're the team that makes it possible. We build middleware - the software infrastructure designed to create large-scale, fault-tolerant applications that run on thousands of machines throughout the world. We’re two dozen programmers building a complex infrastructure using a variety of programming paradigms such as RPC, publish/subscribe and message queues. Our end users are engineers whose needs differ from one another, and we’re trusted to make architectural decisions that will scale across a wide range of use cases. With thousands of clients depending on our infrastructure solutions, we're looking to grow our SRE team. That’s where you come in.

What’s in it for you:

Our application frameworks run on tens of thousands of machines and are used every day by over 5,000 engineers, so your work will have an impact across the entire organization. You’ll be trusted to define the processes that make the system as reliable, scalable and self-healing as possible while delivering high volume, high performance, high throughput and low latency. On any given day, you'll make decisions that impact some of the most critical systems at Bloomberg.

Our infrastructure is built to automate deployment and operations using Chef, with our work developed and open sourced at https://github.com/bloomberg/chef-bcpc and https://github.com/bloomberg/nginx-cookbook. We’re passionate about automation and collaborative development, both within the company and with the wider open source community, so we’re looking for like-minded individuals to join our team.

We’ll trust you to:

  • Review and influence the design and standards of the software
  • Respond to and resolve unexpected service problems and, more importantly, write software to prevent the same problem from happening again
  • Automate everything from deployment and configuration management to mitigation of outages, all aspects end-to-end
  • Manage system releases, write production software acceptance tests and coordinate all aspects of the release including coverage and communication plans
  • Create dashboards and instrument the code to capture and publish essential metrics, and use this data to define alerts
  • Build data analysis tools to keep track of important service level indicators, predict future capacity needs, audit application configurations

You need to have:

  • 3+ years of experience with Python or C++ (exposure to C++ is required)
  • A solid understanding of data structures, algorithms and complexity analysis
  • Good knowledge of Linux/Unix
  • Excellent problem-solving skills
  • BA, BS, MS or PhD in Computer Science, Engineering or a related technology field

We’d love to see:

  • Extensive exposure to working with fault-tolerant approaches in a large-scale distributed environment and high-performance systems
  • Good understanding of internet and networking protocols
  • Ability to handle periodic on-call duty as well as urgent requests
  • Experience with Git, CMake, Jenkins, DPKG, RPM, Docker, Chef, Terraform, OpenStack, VMware, Grafana, time-series databases

Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post: https://www.techatbloomberg.com/blog/bloomberg-bets-big-on-sres/
