SRE - Bloomberg Query Language (BQL) Team

New York

Posted May 30, 2018 - Requisition No. 67342

Our Team:

The Bloomberg Query Language (BQL) team is pushing the envelope to lead the low-latency analytics space in the financial domain. We are developing a cloud-based, low-latency Analytics and Screening Platform for huge financial data sets. We are also creating a Financial Query Language that lets users express complex data retrieval, analytics and screening requests for processing on the BQL Platform.

We're architecting and designing the entire ecosystem for the Analytics Platform and Screener Engine. The platform leverages various technologies to enable distributed and cross-language analytics. Because the platform is federated, the challenges of maintaining the stability and reliability of the system are unique and enormous. That's where you come in.

As an SRE in BQL, you'll be trusted to ensure that our production systems are healthy, monitored, automated, and designed to scale. We'll depend on you to optimize the overall reliability of our clusters through stress tests, chaos engineering, failovers and auto-recovery. You'll develop tools focusing on continuous integration, automated software releases, configuration management and system management.

We'll trust you to:

  • Own, manage, monitor and optimize the reliability and overall health of our development and production environments
  • Work closely with development teams to define standards and ensure that applications are designed with scale, resilience, and performance in mind
  • Streamline software development with continuous integration, deployment automation and agile configuration management
  • Build tools to reduce toil and increase insight into trouble spots
  • Implement effective governance controls in our development lifecycle
  • Manage resiliency design and planning, and the collection and analysis of availability metrics
  • Monitor current capacity, conduct regular capacity testing and predict future capacity needs

You'll need to have:

  • 3+ years of experience in a relevant role (DevOps, Reliability Engineering, Software Development)
  • Strong knowledge of UNIX or Linux systems running distributed application platforms
  • Demonstrated experience managing performance, availability and scale of mid- to large-sized systems
  • Hands-on experience in production deployment and release management

We'd love to also see experience with:

  • Hands-on Java programming
  • JVM tuning and profiling
  • Containerization technologies (like Docker, Kubernetes, Mesos)
  • Configuration management tools (like Chef, Puppet, Ansible)
  • Continuous integration and deployment tools (like Jenkins, Bamboo, SonarQube)
  • Distributed caches (like Redis)
  • Distributed computing

Check out more about how we work and what it means to be an SRE at Bloomberg in our blog post: https://www.techatbloomberg.com/blog/bloomberg-bets-big-on-sres/
