In 2017, Facebook released Prophet, their internal forecasting tool, as an open-source project. Prophet is designed for business forecasting where non-linear trends with yearly, weekly, and daily seasonality are common. Prophet also uses holidays and other user-defined events to improve forecast accuracy.
At the same time, the folks at Fast.ai developed software for time series data that uses neural networks with time-series data with the date split into multiple categorical variables. The software achieved state-of-the-art results in many machine learning competitions.
A key takeaway from these approaches to forecasting is that dates on their own aren't beneficial. It's the relevant features of the dates that are needed to create accurate forecasts.
Prophet and Fast.ai are great forecasting tools, but you can't use them if you aren't a python programmer. In this post, I'll show you how we can achieve similar results using Excel with the Prediction Laboratory add-in.
We'll start with the most basic historical data – just the date and the revenue. Our dataset has two years of data, which is the minimum amount suggested by the team that built Prophet.
Next, add a column for holidays and events. These "Events" can be holidays, marketing promotions, advertising campaigns, or any other event that will have a significant impact on revenue for that date. At Prediction Laboratory, we work with several businesses in the hospitality industry to produce revenue forecasts. These businesses offer several marketing promotions each month, and they have many seasonal trends. The dataset I'm using is a synthetic version that reflects this type of business.
The next step is to extract the relevant features from the dates. I'll use Excel's formulas WEEKDAY, MONTH, and YEAR to add three new columns to the dataset. The WEEKDAY function will help discover the business cycles that happen weekly. The MONTH function uncovers the yearly seasonality, and the YEAR function will help forecast annual growth.
Next, we'll do something that seems a bit surprising – we'll remove the date column. Since we've extracted the important features from the dates, we no longer need them for our model. Some people might object to this step because we're losing the ordering of the data that the dates provided. That ordering is essential to classic time series models like ARIMA and GARCH, but our experience in business forecasting is that this ordering plays only a minor role.
With our data prepared, it's time to create a machine learning model with the Prediction Laboratory add-in for Excel! If you haven't used the Prediction Laboratory before, this short video (link) will show you how to create a model in less than three minutes.
Be sure that the Event, WeekDay, Month, and Year columns in the TrainingData worksheet are Categorical variables.
Because we are trying to predict the revenue, the forecast is just a regular regression problem, and you can experiment with any of the regression models available. For this data set, a Gradient Boosted Tree model gave the best results with an R2 score of 0.75.
With a trained model, it's easy to make predictions for your business's revenue for future dates. Just repeat the process of converting dates to WeekDay, Month, and Year for your prediction data.
So, is our forecast model as good as a Prophet model? No. But that's to be expected – Prophet is a specialized framework for forecasting, yet our model gets us around 80% of the accuracy you can get from Prophet.
Building forecasts using Excel and the Prediction Laboratory is a great way to introduce predictive analytic into your organization. If your boss likes the forecasts and they become a part of the decision-making process, you can always take the next step and invest in a specialized model using Prophet or Fast.ai.