AI Group - Senior MLOps Engineer
New York, NY
Posted Aug 15, 2022 - Requisition No. 105601
The AI Group is the central engineering group responsible for driving machine learning (ML) adoption at Bloomberg, with 200 researchers and engineers working together to provide clients with best-in-class news, research, market data, and analytics using state-of-the-art machine learning technology. We directly impact a wide variety of Bloomberg’s flagship products, including news, research, pricing, our communications platforms, and search and discovery tools. We work on a variety of ML disciplines, including natural language processing, information retrieval, time series analysis, and recommender systems.
What We Do
What sets our group apart is end-to-end ownership of our models and services, which are distributed, high-throughput, low-latency systems that are collectively called billions of times a day. To deliver at such scale, we are building platforms that enable our application-focused ML engineering teams to go from an idea to a model to a scalable service with minimal overhead. We also offer higher-level abstractions and UIs that let domain experts build, deploy, and maintain production ML models for their applications in a self-service manner, with little engineering intervention.
What We Need From You
While working on the team as an MLOps Engineer, you will have the opportunity to enhance our platforms to streamline the productionization of ML models. You will work with both application and platform teams to create a more cohesive, integrated, and managed model development life cycle. Typical activities include:
- Architecting, building, and diagnosing production ML systems
- Working closely with ML application teams to design seamless workflows for continuous model training, inference, and monitoring
- Defining and providing strong SLAs around latency, throughput, and resource (memory / disk / network / CPU / GPU) usage
- Interfacing with both ML experts and platform engineers to understand workflows, pinpoint and resolve inefficiencies, and inform the next set of features for the platforms
- Collaborating with open-source communities and internal platform teams to build a cohesive MLOps experience
- Troubleshooting and debugging user issues
- Providing operational and user-facing documentation
Colleagues who excel in this role often exhibit these qualities:
- Curiosity to solve new problems and keep learning new technologies
- Passion for the engineering behind machine learning, and scaling it
- Industry experience with machine learning teams
- Working knowledge of common ML frameworks such as PyTorch, TensorFlow, scikit-learn, ONNX, etc.
- Prior experience with container technologies like Docker, Kubernetes, Buildpacks, etc.
- Experience with cloud providers such as AWS, GCP or Azure
- Willingness to collaborate with colleagues to achieve repeatable, high-quality outcomes as a team
Our Open-Source Commitment
Our team makes extensive use of open source and is deeply involved in a number of communities such as PyTorch, PyTorch Lightning, Hugging Face, Solr, Kubernetes, Kubeflow, KServe, Kafka, Spark, Argo, Buildpacks, and other cloud-native MLOps technologies. From technical governance to upstream collaboration, we are committed to enhancing the impact and sustainability of open source.
In this role, you’ll be expected to interact with global open-source project teams and communities. If you have a desire to use, develop, and lead open-source software projects, we encourage you to apply. To learn more about our activities in the open-source community, head over to our Tech at Bloomberg site.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.