Software Engineer/SRE - Application Middleware
New York, NY
Posted Mar 8, 2018 - Requisition No. 65104
Our Team:
Bloomberg systems are known to be fast and reliable; we're the team that makes it possible. We build middleware - the software infrastructure designed to create large-scale, fault-tolerant applications that run on thousands of machines throughout the world. We’re two dozen programmers building a complex infrastructure using a variety of programming paradigms such as RPC, publish/subscribe and message queues. Our end users are engineers whose needs vary widely, and we’re trusted to make architectural decisions that will scale across a wide range of use cases. With thousands of clients depending on our infrastructure solutions, we're looking to grow our SRE team. That’s where you come in.
What’s in it for you:
Our application frameworks run on tens of thousands of machines and are used every day by over 5,000 engineers, so your work will have an impact across the entire organization. You’ll be trusted to define the processes that will make the system as reliable, scalable and self-healing as possible while sustaining high volume, high throughput and low latency. On any given day, you'll make decisions that impact some of the most critical systems at Bloomberg.
Our infrastructure is built to automate deployment and operations using Chef, and is developed and open sourced at https://github.com/bloomberg/chef-bcpc and https://github.com/bloomberg/nginx-cookbook. We’re passionate about automation and collaborative development, both within the company and with the wider open source community, so we’re looking for like-minded individuals to join our team.
We’ll trust you to:
- Review and influence the design and standards of the software
- Respond to and resolve unexpected service problems, but more importantly, write software to prevent the same problem from happening again
- Automate everything from deployment and configuration management to mitigation of outages, all aspects end-to-end
- Manage system releases, write production software acceptance tests and coordinate all aspects of the release including coverage and communication plans
- Create dashboards and instrument the code to capture and publish essential metrics, and use this data to define alerts
- Build data analysis tools to keep track of important service level indicators, predict future capacity needs, and audit application configurations
You need to have:
- 3+ years of experience with Python or C++ (exposure to C++ is required)
- A solid understanding of data structures, algorithms, complexity analysis
- Good knowledge of Linux/Unix
- Excellent problem solving skills
- BA, BS, MS or PhD in Computer Science, Engineering or a related technology field
We’d love to see:
- Extensive exposure to working with fault tolerant approaches in a large scale distributed environment and high performance systems
- Good understanding of internet and networking protocols
- Ability to handle periodic on-call duty as well as urgent requests
- Experience with Git, CMake, Jenkins, DPKG, RPM, Docker, Chef, Terraform, OpenStack, VMware, Grafana, time-series databases
Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post: https://www.techatbloomberg.com/blog/bloomberg-bets-big-on-sres/