SRE - Trading Solutions Platform

Careers at Bloomberg

New York, NY

Posted Sep 26, 2018 - Requisition No. 70954

Who we are:

We are the Software Reliability Engineers who build and support the platform that hosts the Bloomberg Trading Solutions products. These include award-winning[1] and most-used[2] Buy-Side and Sell-Side Order Management Solutions (OMS) for the financial industry, which are used to buy and sell almost any type of security. Our OMSs handle millions of trades per day and are used to manage assets worth trillions of dollars. We engineer the platform for these products and ensure that it delivers a stable, performant environment. We do this by concentrating on four basic principles: availability, scalability, reliability and visibility. We employ extensive telemetry for hardware and software monitoring, and design the system with large isolation zones to limit the exposure of potential bugs or failures.

We'll trust you to:

  • Evaluate emerging technologies and techniques, both within Bloomberg and in the industry, through conference attendance, meetup participation and continued education, then extend and refactor our platform with innovative designs
  • Work with other engineering, support and business teams to drive capacity planning, performance analysis, instrumentation and other non-functional systems requirements
  • Conduct pre-production reviews and add value to the design of Trading Solutions (TS) applications from the perspective of application lifecycle management, scalability, fault tolerance and capacity management
  • Participate in all aspects of our agile-based development process, from requirements acquisition and system design to automated testing, packaging and deployment of our team’s software
  • Identify repeatable tasks and automate them
  • Take part in the on-call rotation, handling out-of-office escalations from the Operations team when our automated recovery or detailed runbooks are not sufficient

Who we are NOT:

  • 24/7 Operations staff -- there is a separate dedicated operations team for TS. We strive for a platform that requires minimal intervention, and we automate the handling of any recurring issues that do require it
  • System Administrators
  • Release Engineers for software developed by others. Although hundreds of TS engineers leverage our platform, they each handle the deployment and maintenance of their own software

You need to have:

  • 3+ years of experience developing, deploying and debugging distributed systems in a UNIX (Linux, Solaris/AIX) environment
  • 3+ years of experience programming in C/C++ or Java
  • Experience implementing solutions with standard scripting languages (Python or Perl)
  • A strong knowledge of operating system and networking fundamentals - including IPC mechanisms, multithreading and memory management
  • Proficiency using standard UNIX command line tools and shell environments (ksh or bash)
  • Good understanding of data structures and algorithms
  • Basic SQL knowledge to query various databases as needed
  • A desire to tackle large-scale systems design, troubleshooting and the implementation of global mission-critical services
  • Ability to handle periodic on-call duties and deal with urgent requests

We’d love to see:

  • Experience with monitoring tools such as Grafana or Splunk, and with analyzing metrics and performance data, interpreting it and correlating it to SLOs and SLAs
  • Experience with configuration and orchestration systems such as Chef, Ansible, Puppet or Salt
  • Experience creating guidelines and policies
  • Working knowledge of Git, Test Driven Development, Continuous Integration and Continuous Deployment
  • Experience with containers for local development and/or for deploying to production
  • An interest in how the financial markets work will be a plus

Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post:

[1] TOMS OMS wins best sell-side OMS N-years in a row.
[2] AIM OMS:
[3] Our team is featured here: