Distributed Systems Engineer - DataHub
New York, NY
Posted Oct 8, 2019 - Requisition No. 78878
The DataHub Engineering team provides a distributed platform for hosting datasets, complete with managed data stores, search, discovery, batch analytics and real-time stream processing capabilities. Our goals are to ensure that high-quality content, which is indispensable to financial markets, is cataloged, standardized, discoverable, distributed and accessible in one place, and to offer low-latency, high-availability data stores, pipes, real-time change data capture, distribution and discoverability services. That's where you come in.
Who are you:
The ideal candidate has thrived operating complex systems, diagnosing and resolving the hardest corner-case problems. You are fast on your feet and excited by the challenge of working in a hyper-growth environment where priorities shift quickly. In addition to your Linux systems skills, you are very comfortable with networking and database technologies.
What's in it for you:
As a distributed systems engineer on the DataHub team, you will build systems that scale and distribute referential data and drive Bloomberg’s applications and enterprise offerings. You will build and scale data pipelines for processing (filtering and querying) billions of messages a day. You will also prove out new technologies (Spark, Notebooks, HBase, Vitess) for data exploration and QC. This is an opportunity to operate and engineer systems on a massive scale, and to gain valuable experience in distributed computing. You'll be surrounded by people who are passionate about distributed computing and believe that world-class service is critical to customer success. You'll get the chance to work with development teams across Bloomberg, understand their application requirements, and build systems together.
You'll need to have:
- 4+ years of professional experience in Java/Scala/Golang
- Experience designing services that scale to millions of requests a second
- Proficiency in database storage engines, Linux, kernel subsystems, TCP/IP, and performance engineering
- Understanding of database internals (e.g. MySQL, InnoDB, PostgreSQL) and distributed databases (e.g. HBase, Vitess, YugaByte DB, Galera Cluster replication, MySQL Group Replication)
- Experience in software instrumentation for monitoring and observability
- BA, BS, MS, or PhD in Computer Science, Engineering, or a related technology field
We'd love to see:
- Experience with automated testing, continuous integration, and documentation. Experience with containers and cluster managers (e.g. Docker, Kubernetes, Mesos) is a plus