Senior Software Engineer- Serverless Infrastructure for Data Science

Similar jobs

New York, NY

Posted Nov 10, 2022 - Requisition No. 108998

The Team:

Bloomberg’s Data Science Platform was established to support development efforts around data-driven science, machine learning, and business analytics on Bloomberg's many datasets!

The solution was developed to provide a standard set of tooling for the Model Development Life Cycle (MDLC) which includes tools for the early stages of development and data exploration, to experimentation and large scale training, all the way to live inference. Through access to scalable compute and specialized hardware, Data Science Platform users have access to ML training jobs and Inference Services, Analytics and ETL using Spark, and data exploration with Jupyter. The solution is built on Kubernetes using containerization, container orchestration and a cloud architecture using 100% open source foundations.

What we do:

Our Inference solution is powered by the open source project KServe, a highly scalable and standards based model inference platform for trusted AI. As the founding members of KServe, we regularly upstream features we develop, present at conferences and collaborate with our peers in the industry, and are in tune with the surrounding Kubernetes community.

Delivering performance to latency-sensitive, throughput-heavy, model-driven applications, means making the right choices from hardware to Ingress. As a member of the Data Science Platform’s Infrastructure team with a focus on the Serverless Components, you’ll have the opportunity to work on open source serverless technologies underlying KServe, such as Knative and Istio, as well as looking at the latest hardware available on the market to service hundreds to thousands of models in a scalable way.

We’ll trust you to:

Innovate and design solutions that keep in mind strict production SLA: low latency/high throughput, multi-tenancy, high availability, reliability across clusters/data centers, etc.
Interact with ML experts to understand workflows, pinpoint and resolve inefficiencies, and inform the next set of features for the solution.
Collaborate with open-source communities and internal platform teams to build cohesive model deployment experience.
Automate operations and improve observability of the solution by integrating with systems for metrics and distributed tracing.
Fix and optimize ML model inference performance.
Build tools enabling other engineers a way to debug and understand performance of complicated systems.

You'll need to have:

4+ years of programming experience with an object-oriented programming language (Go, Python, C++, or JavaScript).
A degree in Computer Science, Engineering or similar field of study or equivalent work experience.
Experience designing and implementing low-latency, high-scalability systems.
Experience working in a multi-tenancy and multi-cluster environment.
Experience with distributed systems eg. Kubernetes, Kafka, Zookeeper/Etcd, Spark.
Experience with debugging performance issues with distributed tracing and benchmark tools.

We’d love to see:

Experience with serverless framework or infrastructure, such as Knative, AWS Lambda, Google CloudRun.
Experience working with Service Mesh, authentication & authorization systems like Spire/Spiffe.
Knowledge of GPU compute software and hardware.
Ability to identify and perform OS and hardware-level optimizations.
Experience with ML infrastructure open source project such as Kubeflow, KServe, MLFlow, Feast.
Experience with cloud providers such as AWS, GCP or Azure.
Experience with configuration management systems (Chef, Puppet, Ansible, or Salt).
Experience with continuous integration tools and technologies (Jenkins, Git, Chat-ops).

You can also learn more about our work using the links below:

Machine Learning the Kubernetes Way - https://www.youtube.com/watch?v=ncED2EMcxZ8
ML at Bloomberg -

https://on-demand-gtc.gputechconf.com/gtcnew/sessionview.php?sessionName=s9810-machine+learning+%40+bloomberg%3a+building+on+kubernetes

Inference with KServe(Formally KFServing) - https://www.youtube.com/watch?v=saMkA4fIOH8
Exploring model serving with KServe - https://www.youtube.com/watch?v=FX6naJLaq2Y
The journey to build Bloomberg ML Inference platform- https://www.bloomberg.com/company/stories/the-journey-to-build-bloombergs-ml-inference-platform-using-kserve-formerly-kfserving/

We believe interviewing is a two way street. It's a way for us to get to know you and your skills, and also a way for you to learn more about the team, our technical challenges, and what you'd be working on. The content of each interview round will be tailored to the role and your background, but the general framework can be found here: https://www.bloomberg.com/careers/technology/engineering/software-engineering-experienced-hire

We want to ensure you can put your best foot forward throughout the process, so if you have any questions or need any accommodations to be successful, please let us know!

We have a lot of opportunities to choose from in Engineering, and it is important to us that your skills and experience aligns best with the team you are interviewing with. To help ensure you are placed on the right team, your application will be considered for all of our current vacancies in Engineering at the first stage of the interview process.

Bloomberg is an equal opportunity employer, and we value diversity at our company. We do not discriminate on the basis of age, ancestry, color, gender identity or expression, genetic predisposition or carrier status, marital status, national or ethnic origin, race, religion or belief, sex, sexual orientation, sexual and other reproductive health decisions, parental or caring status, physical or mental disability, pregnancy or parental leave, protected veteran status, status as a victim of domestic violence, or any other classification protected by applicable law.

Bloomberg is a disability inclusive employer. Please let us know if you require any reasonable adjustments to be made for the recruitment process. If you would prefer to discuss this confidentially, please email amer_recruit@bloomberg.net

Salary Range: 160,000 - 240,000 USD Annually + Benefits + Bonus

The referenced salary range is based on the Company's good faith belief at the time of posting. Actual compensation may vary based on factors such as geographic location, work experience, market conditions, education/training and skill level.

We offer one of the most comprehensive and generous benefits plans available and offer a range of total rewards that may include merit increases, incentive compensation [Exempt roles only], paid holidays, paid time off, medical, dental, vision, short and long term disability benefits, 401(k) +match, life insurance, and various wellness programs, among others. The Company does not provide benefits directly to contingent workers/contractors and interns.