We’ve waxed lyrical about the benefits of hackathons on many occasions: testing theories within a collaborative sprint, and learning new things whilst applying them to a real-world situation in the safety of a non-customer-facing environment. Last week our client, Sorted, brought six of its smartest coders and business analysts to Godel HQ for a day of hacking. We were joined by Robin Lester, Microsoft CSA and Data Scientist, who gave us a detailed tour of the Azure Machine Learning Studio. The team had the opportunity to learn about its features, including automated algorithm selection and parameter tuning.
Here’s what we did.
Sorted is a SaaS company with API-first technology running online checkouts, warehouses and shipping using services from thousands of carrier companies. They came up with a practical problem that had a lot of relevance to them: they wanted to predict the number of API calls they’d receive, and therefore the load on their IT systems, based on a number of variables such as:
• The date range of shipments
• Number of consignments
• Number of packages
• Integration method (i.e. API-based vs file transfer)
• Logistics carrier, out of those that Sorted integrates with
The hackathon team worked in Azure Notebooks (https://notebooks.azure.com/) with Python 3.6 and a number of well-known machine learning and data analysis libraries (NumPy, pandas, scikit-learn, Matplotlib, Seaborn). Over the course of the day they completed the following tasks:
1. Loaded the dataset from a CSV file
2. Performed basic exploratory data analysis and manipulation:
· Plotting of features to explore the ones that seemed related to one another
· One-hot encoding of categorical values
3. Removed features that were either collinear or deemed irrelevant, based on the domain knowledge of the developers and business analysts on the team
4. Split the dataset between training and validation sets
5. Trained several regression models, using a number of algorithms which included:
· Multiple linear regression (least squares)
· Random forest
· Gradient boosting
· LightGBM
· Multi-layer perceptron (neural network)
6. Evaluated the performance of these models using a scoring metric (the coefficient of determination, aka R²)
7. Persisted the best performing model to a file using the Python “pickle” module
8. Retrieved the persisted model, loading it from the “pickled” file and using it to make predictions on demand
9. Learned how the trained model could be operationalised into a containerised RESTful service
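For readers who want a feel for steps 1 to 8, here is a minimal sketch of that kind of workflow using pandas and scikit-learn. It is not the code written on the day: the column names, the synthetic data (standing in for the CSV file), the made-up relationship between features and API calls, and the model settings are all assumptions for illustration. LightGBM and the multi-layer perceptron are left out to keep the example short.

```python
import os
import pickle
import tempfile

import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Step 1 (stand-in): build a synthetic dataset instead of reading the real CSV.
# With a real file this would be something like pd.read_csv("shipments.csv").
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "consignments": rng.integers(1, 200, size=n),
    "packages": rng.integers(1, 1000, size=n),
    "integration": rng.choice(["api", "file"], size=n),
    "carrier": rng.choice(["carrier_a", "carrier_b", "carrier_c"], size=n),
})
# Hypothetical target: an invented relationship so the models have a signal.
df["api_calls"] = (
    3 * df["consignments"]
    + 0.5 * df["packages"]
    + np.where(df["integration"] == "api", 50, 0)
    + rng.normal(0, 10, size=n)
)

# Step 2: one-hot encode the categorical columns.
X = pd.get_dummies(
    df.drop(columns="api_calls"),
    columns=["integration", "carrier"],
    dtype=float,
)
y = df["api_calls"]

# Step 4: split into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Steps 5 and 6: train several regressors and score each with R².
models = {
    "linear": LinearRegression(),
    "random_forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "gradient_boosting": GradientBoostingRegressor(random_state=0),
}
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_val, model.predict(X_val))

best_name = max(scores, key=scores.get)
best_model = models[best_name]

# Steps 7 and 8: pickle the best model to a file, load it back, and predict.
with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "best_model.pkl")
    with open(model_path, "wb") as f:
        pickle.dump(best_model, f)
    with open(model_path, "rb") as f:
        restored_model = pickle.load(f)

predictions = restored_model.predict(X_val)
print(best_name, scores)
```

One caveat worth keeping in mind: a pickled model should only be loaded from a trusted source, and ideally with the same library versions that were used to train it.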
Sorted gave the day a glowing recommendation. Martin Mayer, Offshore Development Manager at Sorted, said:
“My team and I have been interested in Machine Learning for some time. We’ve been able to learn the basics on our own, but we were unsure what to learn next.
“In contrast, the day with Jorge and Robin gave us lots of knowledge to hit the ground running, with several different approaches. We’re really grateful for the tutorials around Microsoft Azure Machine Learning Studio and Python with Microsoft Azure Notebooks. Jorge and Robin were both really interested in the more advanced problems we’d like to take on: for example, Robin suggested which Azure solutions we could look at for image anomaly detection. Our business analysts joined us for part of the day – it was great to see that there are options to apply models without code, using tools like ML Studio.
“On top of this, we were made very welcome by Godel. Its modern offices provided a relaxed space for learning – plus, the team provided a great lunch for us. Many thanks to everyone involved.
“We plan to keep a weekly session going here at Sorted, building on what we learned. We’ll keep in contact and be sure to share what wizardry we create with this technology.”
If you’re interested in how you can use Machine Learning in your business, contact us at hello@godeltech.com.