SRE – Multi Asset Risk Systems (MARS)
New York, NY
Posted Jul 16, 2020 - Requisition No. 83925
We are redefining our data model and computation infrastructure to help answer a range of client questions and give clients richer views and analysis of our data. Our work involves building performant backend systems for macro- and micro-level investor and company flow analysis, and building appealing workflow solutions that help our clients visualize and understand the information so they can make informed investment decisions. We are looking for a developer to join our highly upbeat and ambitious team: someone who is willing to learn our full technology stack and diverse financial domain, and who will work with us to build the next generation of investor and company statistics engines, calculations, analytics, and visualization functionality.
Our platform provides risk calculations and analytics across various asset classes by applying distributed computing techniques that span hundreds of machines. In addition to expanding our main product offering, our primary concern is performance, which requires using a variety of methods and technologies to reach the optimal solution. We use big data software, distributed computing algorithms, dynamic resource allocation, and cluster management, among others.
What's in it for you:
You will join a close-knit and growing group of 120 engineers. We develop best practices, tools, and processes that have a direct impact on how application developers at Bloomberg deploy and manage their applications. System Engineers will trust you as an escalation point, and you'll regularly collaborate with them to maintain the stability and performance of operating systems and servers. We'll depend on you to guide the design, architecture, and use of enterprise-class configuration and orchestration systems.
We'll trust you to:
- Own, manage, monitor, and optimize the reliability and overall health of our development and production environments
- Work closely with development teams to define standards and ensure that applications are designed with scale, resilience, and performance in mind
- Streamline software development with continuous integration, deployment automation, and agile configuration management
- Configure newly allocated cluster and machine capacity, and streamline and automate the quality control pipeline
- Build tools to reduce toil and increase insight into trouble spots
- Implement effective governance controls in our development lifecycle
- Manage resiliency design and planning, and the collection and analysis of availability metrics
- Monitor current capacity, conduct regular capacity testing, and predict future capacity needs
You'll need to have:
- 3+ years of experience in Python
- 3+ years of experience in a relevant role (DevOps, Reliability Engineering, Software Development)
- Strong knowledge of Linux systems running distributed application platforms
- Demonstrated experience in managing performance, availability, and scale of mid- to large-sized systems
- Hands-on experience in production deployment and release management
- Energy, self-motivation, and independence to balance multiple tasks and work in a global environment
We'd love to see:
- Continuous integration and deployment tools (like Jenkins, Bamboo, SonarQube)
- Configuration management tools (like Chef, Puppet, Ansible)
- Relational and NoSQL databases (like PostgreSQL, Cassandra)
- Experience with automated testing tools
- Strong code review skills
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.