Senior Applied ML Scientist (f/m/d) at Hive

Posted on: 05/19/2022

Location: Berlin, Germany (ON-SITE)

full time

Glassdoor: 5.0 / 5

Tags: hubspot scikit sql github matplotlib aws numpy pandas python ml

**The position**

The Hive Data Team

We’re setting up a dedicated Data Team to help power our business and allow us to serve our customers better. Some examples of use cases that our Data Team will be tackling:

* Automated demand forecasts to allow us to allocate resources correctly as we enter new countries across Europe.
* Predicting delivery times to allow us to choose the best carrier for each parcel and give customers more certainty about when their order will arrive.
* Helping business teams to make better decisions - for example by getting better insights about the performance and pricing of different delivery carriers.
* Optimizing our operations by finding the optimal picking routes in a warehouse, the optimal placement of inventory, etc.

We’re convinced that laying the right foundations now to use our data to its fullest potential will have big payoffs both for us as a company and for our customers in terms of lower prices and better service, and will therefore be an important differentiator for us in the market - and we want you to help us get there.

**What you will be doing**

You’ll be joining us as one of the first members of our Data Team. As an applied ML Scientist, this means you’ll be developing and productionizing models to predict relevant business metrics (e.g. delivery times or order volumes) in order to help us and our customers make good decisions. Some examples of what this means concretely:

* Collaborating with people from our business teams to define problems and discuss solutions on a conceptual level - e.g. discussing our needs for product demand forecasting with someone from our Supply Chain team.
* Determining (together with business stakeholders) which data would be useful to help us make good predictions and working with other teams to ensure that we start collecting this data.
* Training statistical and Machine Learning models on our data and choosing the ones with the best performance (potentially with the help of AutoML tools).
* Deploying your models to production and making the predictions accessible to our applications.
* Determining useful metrics for assessing model performance together with business stakeholders, and tracking and improving those metrics over time.

**What you will not be doing (for now)**

* Machine Learning research. This is an applied role, not a research role. We don’t care about developing novel techniques that are on the cutting edge of ML research, but rather about doing the simplest thing possible to create useful forecasts for our users. The challenging part will therefore be more about choosing the right approach conceptually, making sure we get the data we need, deploying the model in a reliable way, and thinking deeply about the needs of our models’ users to make sure that your work has the greatest possible impact on the business.

**Our Stack**

* BigQuery. We use Google BigQuery as our Data Warehouse.
* DBT. We use DBT to perform transformations on data in BigQuery in a way that follows software engineering best practices like tracking changes in a version control system, modularity, CI/CD, and documentation.
* Fivetran. We import data from HubSpot, Google Analytics and other sources into our data warehouse using Fivetran (so that we don’t need to re-invent the wheel by writing our own connectors).
* Metabase. Metabase is our go-to tool for creating BI dashboards and queries that help our business make better decisions.
* Modern tooling. Our code is hosted on GitHub, and we use Linear for issue tracking, Notion as our internal wiki, and Slack for internal communication. Our marketing websites are built with Webflow.

**Your profile**

* End-to-end experience bringing ML models (especially ones using time-series data) live: from conceptualisation and problem definition, through training, to running inference in production and maintaining the models (and making this data available for use in our application).
* Fluency in Python and familiarity with its scientific stack (numpy, pandas, scikit-learn, matplotlib), plus the ability to write maintainable code that follows software engineering best practices.
* You can run the model you developed in production yourself (e.g. with AWS SageMaker or AWS Forecast).
* Before starting to work on a use case, you carefully challenge existing assumptions and care a lot about clean data.
* Experience with time-series analysis and forecasting.
* You are an excellent communicator and can explain the intuition behind your modelling approach to people with different backgrounds. You can also collaborate with people from our business teams to figure out the best way to frame our prediction problems and which data we have available to solve them.
* This is a hands-on role, not a pure research role: you’ll also need to get your hands dirty cleaning and transforming data.

**Bonus Points**

* Strong SQL skills
* Experience with ML Engineering (i.e. running ML models in production)
* Experience with AWS and Google cloud services (our database and apps are currently hosted on AWS, our data warehouse on Google BigQuery)
* Experience managing and improving data observability and quality (potentially using tools like Monte Carlo)
* Experience working with orchestrators like Airflow

**Our offering**

* Join the Hive. Be part of a team of highly motivated and talented people who enjoy what they do and care about doing excellent work, while also having each other's backs when the sh*t hits the fan.
* Work from anywhere. The tech team at Hive is remote-first, meaning that we hold almost all meetings via video call. While many of us enjoy coming to the office to whiteboard or have a beer/club mate together, our remote-first setup allows us to work together with great people both inside and outside Berlin (as long as the timezone is within +/- 3 hours of Berlin).
* You'll be valued. We are a tech-driven company, and our software is a large part of what sets us apart from the competition. Your contribution will be rewarded accordingly, as we offer attractive compensation packages (salary + equity).
* Take ownership and work close to the business. At Hive, you'll be collaborating closely with teammates from operations or customer experience to discover together how best to solve our (and most importantly, our customers') problems with code.
* Enjoy a high-quality developer experience. We recognise that keeping tech debt at sustainably low levels (by writing readable, maintainable, well-tested code) and enabling engineers to use their time effectively (by having user-friendly and fast CI and deployment pipelines, as well as proper monitoring and error logging) is an investment in our productivity and happiness.
* We'll get you set up. New tech equipment that you need, IDEs, mechanical keyboards, screens, you name it - we want you to do your best work.
* There's more! Enjoy flexible working hours, free drinks in one of our offices across Europe, and join regular (virtual or in-person) team events.

**About us**

We provide logistics made for tomorrow. Better e-commerce operations - fast, transparent, and affordable.

Hive’s mission is to digitize logistics for e-commerce merchants, enabling businesses to run their operations on autopilot. Hive is the operational brain for D2C brands, taking on order fulfillment, shipping, returns, and more for a wide range of companies and product categories. We help fast-growing brands such as Yepoda, Ela Mo, and Holy Energy to grow. Backed by acclaimed global investors including EarlyBird, TigerGlobal, Picus Capital, and Activant, we are rapidly expanding our pan-European network with our newly opened fulfillment center and office in Paris, with Milan, Madrid, and London in the pipeline.

Hive is logistics for the future.
We question the conventional way of doing things and aim to push boundaries in all areas. Hive focuses on changing the way e-commerce fulfillment is done and strives toward creating the most advanced and sustainable solutions in the logistics field, enabling our customers to keep growing while staying close to their own customers.

At Hive, we believe fostering diversity in every team makes us a stronger company overall. We do not discriminate based on religion, skin color, nationality, gender, sexual orientation, age, marital status, or disability, and encourage applications from all backgrounds. We strive to create an inclusive workplace where everyone feels encouraged to be their true self and to grow professionally.