Senior Data Science Platform Engineer

New York, NY

Posted Feb 24, 2017 - Requisition No. 57197

Bloomberg runs on data. It's our business and our product. From the biggest banks to the most elite hedge funds, financial institutions need timely, accurate data to capture opportunities and evaluate risk in fast-moving markets. With petabytes of data, our data science team is at the forefront of innovation in our business. We transform large amounts of structured and unstructured data such as text, time series, and events into machine-readable knowledge fueling applications and consumer decisions. The platform which supports these efforts is critical to their success.

That's where you come in. Working in a talented multi-disciplinary team, you will be responsible for the research, development, and stability of our next-generation Data Science platform. This role offers the ability to truly innovate and invent, helping define the technical foundations of this groundbreaking system. You will help build a system that uses containerization and modern container orchestration on top of cutting-edge hardware, including GPUs, and that rivals supercomputing platforms across the world.

Our team:

The Data Science Infrastructure team is a new team which was established to build a platform supporting development efforts around data-driven science, machine learning, and business analytics. This is a very young team, enabling you to make a large impact by bridging advanced infrastructure with the worlds of Machine Learning and Data Science.

What's in it for you:

You'll have the opportunity to make key technical decisions which will bring this platform into the future. You'll be able to apply your existing knowledge while gaining experience in the areas of orchestration, containerization, GPUs, and data science. You'll have the opportunity to contribute to solutions that support new functionality within the Bloomberg Terminal, a leading driver of financial decisions around the world.

How we give back:

This new team will make extensive use of open source software. As part of that, we commit to upstreaming the features we develop. Whether pushing bug fixes upstream, developing new features, giving presentations at conferences and meetups, or collaborating with industry leaders, open source will be at the heart of this team. It's not just something we do in our free time; it is how we work.

We’ll trust you to:

Interact with data scientists to understand their workflows and requirements
Design and deploy solutions for problems such as high availability, elastic load distribution, and high throughput
Automate operation, installation, and monitoring of data science ecosystem components in our infrastructure stack

You’ll need to be able to:

Troubleshoot and debug run-time issues
Provide developer and operational documentation
Provide performance analysis and capacity planning for clusters
Identify and perform OS and hardware-level optimizations
Be organized and multitask in a fast-paced environment

You’ll need to have:

Experience programming in Python, Java, Scala, JavaScript, or Ruby
Linux systems administration experience (Bash, Network, Filesystems)
Experience with configuration management systems (Chef, Puppet, Ansible, or Salt)
Experience with continuous integration tools and technologies (Jenkins, Git, Chat-ops)

We’d love to see:

Experience building and scaling Docker-based systems using Kubernetes, Swarm, Rancher, or Mesos
Experience configuring, deploying, and managing Apache Spark and Hadoop HDFS
Experience working with authentication and authorization systems such as Kerberos and LDAP
Experience working with GPU compute software and hardware
Open source involvement such as a well-curated blog, accepted contributions, or community presence

If this sounds like you, apply! You can also learn more about our work using the links below: