Senior Data Science Platform Engineer
New York, NY
Posted Jul 24, 2017 - Requisition No. 59651
Bloomberg runs on data. It's our business and our product. From the biggest banks to the most elite hedge funds, financial institutions need timely, accurate data to capture opportunities and evaluate risk in fast-moving markets. With petabytes of data, our data science team is at the forefront of innovation in our business. We transform large amounts of structured and unstructured data such as text, time series, and events into machine-readable knowledge fueling applications and consumer decisions. The platform that supports these efforts is critical to their success.
That's where you come in. Working in a talented multi-disciplinary team, you will be responsible for the research, development, and stability of our next-generation Data Science platform. This role offers the ability to truly innovate and invent, helping define the technical foundations of this groundbreaking system. Using containerization and modern container orchestration systems on top of cutting-edge hardware, including GPUs, you will help build a system that rivals supercomputing platforms across the world.
The Data Science Infrastructure team is a new team which was established to build a platform supporting development efforts around data-driven science, machine learning, and business analytics. This is a very young team, enabling you to make a large impact by bridging advanced infrastructure with the worlds of Machine Learning and Data Science.
What's in it for you:
You'll have the opportunity to make key technical decisions which will bring this platform into the future. You'll be able to apply your existing knowledge while gaining experience in the areas of orchestration, containerization, GPUs, and data science. You'll have the opportunity to contribute to solutions that support new functionality within the Bloomberg Terminal, a leading driver of financial decisions around the world.
How we give back:
This new team will make extensive use of open source software. As part of that, we commit to upstreaming the features we develop. Whether pushing bug fixes upstream, developing new features, giving presentations at conferences and meetups, or collaborating with industry leaders, open source will be at the heart of this team. It's not just something we do in our free time; it is how we work.
We’ll trust you to:
- Interact with data scientists to understand their workflows and requirements
- Design and deploy solutions for problems such as high availability, elastic load distribution, and high throughput
- Automate operation, installation, and monitoring of data science ecosystem components in our infrastructure stack
You’ll need to be able to:
- Troubleshoot and debug run-time issues
- Provide developer and operational documentation
- Provide performance analysis and capacity planning for clusters
- Identify and perform OS and hardware-level optimizations
- Be organized and multitask in a fast-paced environment
You’ll need to have:
- Linux systems administration experience (Bash, Network, Filesystems)
- Experience with configuration management systems (Chef, Puppet, Ansible, or Salt)
- Experience with continuous integration tools and technologies (Jenkins, Git, ChatOps)
We’d love to see:
- Experience building and scaling Docker-based systems using Kubernetes, Swarm, Rancher, or Mesos
- Experience configuring, deploying, and managing Apache Spark and Hadoop HDFS
- Experience working with authentication and authorization systems such as Kerberos and LDAP
- Experience working with GPU compute software and hardware
- Open source involvement such as a well-curated blog, accepted contributions, or community presence
If this sounds like you, apply! You can also learn more about our work using the links below: