Articles about Machine Learning

Parallax Proves a High-Value Concept and Gains a Predictive Machine Learning Model by Collaborating with OmbuLabs

Parallax was beginning to explore the use of artificial intelligence (AI) or machine learning (ML) to leverage the wealth of data on hand about customer projects, with the goal of improving their resource planning. The company thought it might be possible to create a machine learning model that identifies customer projects at risk, equipping the Customer Success team to make data-driven recommendations on how to head off problems before they occur.

Read more »

Machine Learning: An Introduction to Gradient Boosting

Welcome to the third article in our Machine Learning with Ruby series!

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. We then looked into our first ensemble model technique, Random Forests, in Machine Learning: An Introduction to Random Forests. It is a good idea to review that article before diving into this one.

Random Forests are great for a wide variety of cases, but there are also situations where they don’t perform quite as well. In this article we’ll take a look at another popular tree-based ensemble model: Gradient Boosting.

Read more »

Machine Learning: An Introduction to Random Forests

In our previous article Machine Learning: An Introduction to CART Decision Trees in Ruby, we covered CART decision trees and built a simple tree of our own. Decision trees are very flexible and are a good tool for simple classification, but they are often not enough when it comes to real-world scenarios.

When dealing with large and complex data, or when dealing with data with a significant amount of noise, we need something more powerful. That’s where ensemble models come into play. Ensemble models combine a number of weak learners to build a strong model, with increased accuracy and robustness. Ensembles also help manage and reduce bias and overfitting.

In this article, we’ll cover a very popular tree-based ensemble model: Random Forest.

Read more »

Pecas: Machine Learning Problem Shaping and Algorithm Selection

In our previous article, Machine Learning Aided Time Tracking Review: A Business Case we introduced the business case behind Pecas, an internal tool designed to help us analyse and classify time tracking entries as valid or invalid.

This series will walk through the process of shaping the original problem as a machine learning problem and building the Pecas machine learning model and the Slackbot that makes its connection with Slack.

In this first article, we’ll talk through shaping the problem as a machine learning problem and gathering the data available to analyse and process.

Read more »

Machine Learning: An Introduction to CART Decision Trees in Ruby

In the middle of last year, we released an internal tool to help address a pretty significant issue. That is how the Pecas tool was born, and you can read about the Business Case for Pecas here.

Pecas relies on a binary classification machine learning model to classify time entries as valid or invalid. It is a combination of a Django app, that hosts the Slackbot and other data processing tasks, and a FastAPI app that hosts the machine learning model built using the Scikit-learn Python library. Scikit-learn provides a great set of classification models you can use, which are optimized and very robust, making it a solid choice to build your model. However, understanding the principles behind the classification can be a bit tricky, and machine learning models can feel a bit like a black box.

In this series, we’ll explore some principles of machine learning, namely binary classifiers, and walk through how they connect to each other, in Ruby. This article will focus on decision trees, namely CART (Classification And Regression Trees) and a little bit of the mathematics behind them.

Read more »

Machine Learning Aided Time Tracking Review: A Business Case

As an agency, our business model revolves around time. Our client activities rely on a dedicated number of hours per week worked on a project, and our internal activities follow the same pattern. As such, time tracking is a vital part of our work. Ensuring time is tracked correctly, and time entries meet a minimum quality standard, allows us to be more data-driven in our decisions, provide detailed invoices to our clients and better manage our own projects and initiatives.

Despite being a core activity, we had been having several issues with it not being completed or not being completed properly. A report we ran at the end of 2022 showed our time tracking issues were actually quite severe. We lost approximately one million dollars in 2022 due to time tracking issues that led to decisions made on poor data. It was imperative that we solved the problem.

To help with this issue, we created an evolution of our internal Pecas project. We turned Pecas into a machine learning powered application capable of alerting users of issues in their time entries. In this article, we’ll talk though the business case behind it and expected benefits to our company.

Read more »