Facebook Prophet is an open-source library developed by Facebook’s in-house data science team to address time series forecasting problems. This blog will give you insights into some of the key features that make this model stand out from the rest. A practical guide on how to implement Prophet can be found here. So what are you waiting for? Let’s get started!
Before we jump into the blog, let us quickly understand what time series forecasting is!
What is Time Series Forecasting?
Time series forecasting is a common machine learning problem with wide applicability across different domains! Its main challenge is understanding how trend and seasonal patterns vary with time, which makes it more complex than many other ML problems! Given this wide applicability, many models have been developed of late, and in this post we will focus on Facebook’s Prophet, a prominent time series forecasting model used across the industry! Let’s get started!
Introduction to Facebook’s Prophet
Prophet is a powerful open-source library built by Facebook specifically to solve time-series problems. It has many inbuilt features to address some of the common challenges we have in time series forecasting. The model fits the series as a sum of separate components, which lets it capture trends, weekday/weekend movements, and seasonal patterns automatically. In this blog, we will discuss some of the inbuilt features of Prophet.
Official documentation on Prophet can be found at https://facebook.github.io/prophet/
Advantages of Prophet
So what makes Facebook Prophet special? Well, it’s a full-blown library built with most of the common challenges in time series forecasting in mind. Following are a few advantages we could list out:
- The primary advantage of Facebook’s Prophet is that anyone can work with it, even without prior experience or in-depth knowledge of time series modeling.
- Facebook’s Prophet is accurate and fast (it uses Stan to fit the model).
- Prophet allows adjustment of parameters and customized seasonality components, which may improve the forecasts.
- Prophet is robust to outliers and missing data, so it needs less manual cleaning.
Well, the list can go on and on!
Before we deep-dive into the features of Prophet, let us understand the basic math it uses:
y(t) = g(t) + s(t) + h(t) + e(t)
g(t) = trend component, modelled as linear or logistic growth over time.
s(t) = seasonal component (daily / weekly / quarterly / yearly).
h(t) = holidays/events effects.
e(t) = error term caused due to unexpected occurrences.
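To make the decomposition concrete, here is a minimal sketch in plain Python (not Prophet’s own code). It builds a toy daily series from a linear trend, a weekly sine-wave seasonality and a one-day holiday bump. The slope, amplitude, holiday date and effect size are made-up values chosen only for illustration.

```python
import math


def g(t, k=0.5, m=10.0):
    """Linear trend: slope k per day, offset m (illustrative values)."""
    return k * t + m


def s(t, amplitude=2.0, period=7.0):
    """Weekly seasonality, modelled here as a single sine wave."""
    return amplitude * math.sin(2 * math.pi * t / period)


def h(t, holiday_days=(10,), effect=5.0):
    """Holiday effect: a fixed bump on the listed days, zero elsewhere."""
    return effect if t in holiday_days else 0.0


def y(t, e=0.0):
    """Additive model: y(t) = g(t) + s(t) + h(t) + e(t)."""
    return g(t) + s(t) + h(t) + e


# Two weeks of noise-free toy data (e(t) = 0 for every day).
series = [y(t) for t in range(14)]
```

Each component can be inspected on its own, which is the same idea behind the component plots Prophet produces after forecasting.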
Detailed explanation on Prophet’s Parameters
This part of the blog will give you an in-depth understanding of the different parameters of Prophet and its features. These parameters can be fine-tuned, which empowers users to create complex time series forecasting models easily. Listed below are the parameters, each followed by a brief description of its function.
- Parameters of Trend
- Saturating Growth
- Parameters of Seasonality
- Parameters of Holidays and Events
- Trend Changepoints
- Additional Regressor
Parameters of Trend
The growth parameter of the Prophet model is chosen according to the pattern of the trend present in the data.
This parameter has to be specified as ‘linear’ or ‘logistic’ based on the nature of the data.
- growth = ‘linear’, if there is a linear trend in the data
Example: Salary rising steadily with experience.
- growth = ‘logistic’, if the trend is non-linear and levels off at a limit
Example: A user base that grows quickly at first and then flattens as the market saturates.
The above image clearly shows the difference between linear and non-linear trends.
Saturating Growth
Saturating growth comes into play when a quantity, such as a company’s revenue, keeps increasing but cannot grow without limit. We then require our forecasts to saturate at some threshold level in the future. This threshold level is known as the carrying capacity, denoted by ‘cap’.
In the same manner, the forecasts should not fall below a minimum threshold level. This is represented as ‘floor’.
When growth = ‘logistic’, a column named ‘cap’ (and, if a minimum is needed, a column named ‘floor’) should be assigned to the
- training set before fitting the model
- future set before forecasting
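The following sketch shows both halves of this idea. The first function is a plain-Python picture of a curve that saturates between ‘floor’ and ‘cap’; its growth rate k and midpoint m are illustrative values, not something Prophet exposes directly. The second function is a hedged example of how the two columns might be attached to the training and future sets, assuming a dataframe with the usual ‘ds’ and ‘y’ columns and a constant cap and floor.

```python
import math


def logistic_trend(t, cap, floor=0.0, k=0.1, m=50.0):
    """Saturating trend that stays between floor and cap.

    k (growth rate) and m (midpoint) are illustrative parameters.
    """
    return floor + (cap - floor) / (1.0 + math.exp(-k * (t - m)))


def forecast_with_saturation(df, cap, floor, periods):
    """Fit Prophet with logistic growth; df needs 'ds' and 'y' columns."""
    from prophet import Prophet  # named 'fbprophet' in releases before 1.0

    df = df.copy()
    df["cap"] = cap      # required for logistic growth
    df["floor"] = floor  # optional lower bound
    model = Prophet(growth="logistic")
    model.fit(df)

    future = model.make_future_dataframe(periods=periods)
    future["cap"] = cap      # the future set needs the same columns
    future["floor"] = floor
    return model.predict(future)
```

In practice the cap does not have to be constant; it can be any column of values, for example a market size that itself grows over time.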
Parameters of Seasonality
Seasonal components play a vital role in time series forecasting. These components are periodic events that are mostly stable with respect to time. This is referred to as the seasonality of time series.
Prophet comes with the following three parameters to tune the seasonality:
- Seasonality mode
- In-built seasonality
- Custom seasonality
By default, Prophet fits additive seasonality, meaning the seasonal effect is added to the trend to produce the forecast. There is an option to set the seasonality mode to multiplicative (seasonality_mode = ‘multiplicative’), which is appropriate when the size of the seasonal swings rises and falls with the trend.
As a rule of thumb, data whose seasonal swings stay roughly the same size while the trend changes suits additive seasonality, whereas data whose swings grow or shrink in proportion to the trend suits multiplicative seasonality.
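The difference between the two modes is easiest to see with numbers. This is a small plain-Python sketch, with made-up trend levels and seasonal sizes, comparing how large the seasonal swing is at a low and at a high trend level under each mode.

```python
def additive(trend, seasonal):
    """Seasonal effect is added in the units of y."""
    return trend + seasonal


def multiplicative(trend, seasonal_fraction):
    """Seasonal effect is a fraction of the trend, so it scales with it."""
    return trend * (1.0 + seasonal_fraction)


low, high = 100.0, 1000.0  # two illustrative trend levels

# Size of the seasonal swing at each trend level.
add_swing = (additive(low, 10.0) - low, additive(high, 10.0) - high)
mul_swing = (multiplicative(low, 0.1) - low, multiplicative(high, 0.1) - high)
```

With additive seasonality the swing is the same at both levels, while with multiplicative seasonality it is ten times larger at the higher level. In Prophet the mode is chosen when the model is created, for example Prophet(seasonality_mode='multiplicative').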
By default, Prophet automatically fits daily, weekly and yearly seasonalities when the data supports them. These are controlled by the daily_seasonality, weekly_seasonality and yearly_seasonality parameters. In order to force one of these in-built components on, its value should be set to ‘True’.
In order to replace an in-built component with a custom one, its value should be set to ‘False’.
Custom seasonality can be specified using the add_seasonality function.
Three arguments are to be specified in the add_seasonality function:
name: The name of the seasonality, given as a string.
period: A numeric value (in days) after which the seasonal pattern repeats.
Here, period = 1 means a daily seasonal pattern that repeats once every day.
Similarly, a yearly pattern corresponds to a period of about 365 days (365.25 to account for leap years).
fourier_order: The number of sine and cosine pairs used to approximate the seasonal pattern. Higher values let the curve change more quickly, at the risk of overfitting.
Playing around with different values of fourier_order gives a better understanding.
Figure 1.1 : Yearly seasonality with different fourier_order
The images in Figure 1.1 show the seasonality pattern captured using different Fourier orders. The image on the top has fourier_order set to 1. The image on the bottom, with fourier_order = 48, shows a much wavier curve.
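To see why a higher fourier_order gives a wavier curve, the sketch below builds the sine and cosine terms for a given period in plain Python. It is an illustration of the idea rather than Prophet’s internal code. The second function shows how a custom seasonality might be attached to an unfitted model; the name ‘monthly’ and the period of 30.5 days are illustrative choices.

```python
import math


def fourier_features(t, period, fourier_order):
    """Return the 2 * fourier_order sine/cosine terms for time t (in days)."""
    features = []
    for n in range(1, fourier_order + 1):
        angle = 2.0 * math.pi * n * t / period
        features.append(math.sin(angle))
        features.append(math.cos(angle))
    return features


def add_monthly_seasonality(model, fourier_order=5):
    """Attach a custom seasonality to an unfitted Prophet model."""
    # 'monthly' and period=30.5 are illustrative, not Prophet defaults.
    model.add_seasonality(name="monthly", period=30.5, fourier_order=fourier_order)
    return model
```

Each extra order adds one faster-oscillating sine/cosine pair, so fourier_order = 48 gives the model 96 terms to bend the yearly curve with, compared with only 2 terms for fourier_order = 1.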
Parameters of Holiday / Events
We may deal with many holidays and events that take place in time series data. It is necessary to include these important dates in our forecasting model, as there may be different patterns on these dates.
Prophet allows us to specify these dates in two ways:
- In-built function
- External variable
In-Built Holidays Effects
Prophet provides in-built functionality for adding these holidays and passing their effects to the forecasting model.
add_country_holidays is a function which receives a country name or code through its country_name argument. This will get us all the holidays in the country’s calendar.
Prophet provides in-built holidays for a limited set of countries (approximately 13 at the time of writing).
Apart from in-built holidays, there may be many other important dates (events in the past or in the future) that should be specified to the forecasting model.
Custom holidays should be assigned as a dataframe with two columns, namely ‘holiday’ and ‘ds’, and passed to the model through the ‘holidays’ argument.
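As a rough sketch of how the two approaches can be combined, the code below first builds the rows of a custom holiday table in plain Python and then, inside a function, turns them into a dataframe and creates a model that also uses a country’s in-built calendar. The event names and dates are hypothetical, and ‘US’ is just an example country.

```python
def build_holiday_rows(name, dates):
    """One row per occurrence, using the 'holiday' and 'ds' column names."""
    return [{"holiday": name, "ds": d} for d in dates]


# Hypothetical events; replace with dates that matter for your data.
rows = (
    build_holiday_rows("product_launch", ["2021-03-15", "2022-03-14"])
    + build_holiday_rows("annual_sale", ["2021-11-26", "2022-11-25"])
)


def model_with_holidays(rows, country_name="US"):
    """Create a Prophet model with custom and in-built holidays."""
    import pandas as pd
    from prophet import Prophet  # named 'fbprophet' in releases before 1.0

    holidays = pd.DataFrame(rows)
    holidays["ds"] = pd.to_datetime(holidays["ds"])
    model = Prophet(holidays=holidays)  # custom events
    model.add_country_holidays(country_name=country_name)  # in-built calendar
    return model
```

Both past and future occurrences of an event should be listed, since the model can only apply a holiday effect to future dates it has been told about.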
Additional Regressor
Prophet additionally allows us to add regressors, extra variables that may affect the forecast.
These can be implemented using the ‘add_regressor’ function. The regressor column should be added to the
- Training set before the fit method,
- Future set before the predict method.
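A minimal sketch of that workflow is shown below. It takes an unfitted Prophet model and two dataframes, checks that the regressor column is present in both, and then registers the regressor before fitting. The helper name and the error message are illustrative, not part of Prophet.

```python
def fit_with_regressor(model, train_df, future_df, regressor):
    """Fit and forecast with one extra regressor column.

    `model` is an unfitted Prophet instance; both frames must already
    contain a column named `regressor`.
    """
    for label, frame in (("training", train_df), ("future", future_df)):
        if regressor not in frame.columns:
            raise ValueError(f"{label} set is missing column {regressor!r}")

    model.add_regressor(regressor)  # must be called before fit
    model.fit(train_df)
    return model.predict(future_df)
```

A typical call might look like fit_with_regressor(Prophet(), train, future, 'temperature'), where ‘temperature’ is a hypothetical column whose future values are already known or separately forecast.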
Trend Changepoints
One main feature of Facebook’s Prophet is that it is largely automatic and capable of capturing trends and seasonal patterns on its own. Changepoints are the points in time-ordered data at which the trend changes its rate of growth. Prophet detects them automatically, and they can either be left at the default level or customized.
‘add_changepoints_to_plot’ is a function that allows us to visualize the changepoints in our data.
The flexibility of the changepoints matters when the model overfits or underfits the trend. This can be addressed using ‘changepoint_prior_scale’: increasing it makes the trend more flexible (which can fix underfitting), and decreasing it makes the trend more rigid (which can fix overfitting).
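The sketch below illustrates what a changepoint does to a linear trend, using plain Python and made-up numbers: at each changepoint the slope changes by some amount, and the offset is adjusted so the line stays continuous. The second function is a hedged example of overlaying the detected changepoints on a forecast plot, assuming a fitted model and its forecast dataframe.

```python
def piecewise_linear_trend(t, k, m, changepoints, deltas):
    """Trend with base slope k and offset m.

    The slope changes by deltas[j] once t reaches changepoints[j].
    """
    slope, offset = k, m
    for s_j, d_j in zip(changepoints, deltas):
        if t >= s_j:
            slope += d_j
            offset -= s_j * d_j  # keeps the line continuous at s_j
    return slope * t + offset


def plot_changepoints(model, forecast):
    """Draw the forecast and mark the changepoints Prophet found."""
    from prophet.plot import add_changepoints_to_plot  # 'fbprophet.plot' before 1.0

    fig = model.plot(forecast)
    add_changepoints_to_plot(fig.gca(), model, forecast)
    return fig
```

A larger ‘changepoint_prior_scale’, for example Prophet(changepoint_prior_scale=0.5) instead of the default 0.05, allows larger slope changes at each changepoint.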
Future and Forecasts
Defining Future datasets
It is important to define the length of the future set together with its frequency.
‘make_future_dataframe’ is a function that takes two main arguments, namely ‘periods’ and ‘freq’.
- periods is the number of future time steps the model should forecast (the number of days when the frequency is daily).
- freq is the frequency (the time step) of the future set.
- It can be ‘30min’ for 30 minutes time period,
- ‘H’ for hourly,
- ‘D’ for daily.
If the frequency is not mentioned, the default value is ‘D’ (daily).
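To show how the number of steps and the frequency work together, here is a small plain-Python sketch that generates future timestamps after the last observed date. It only mimics the idea behind ‘make_future_dataframe’; the helper name and the dates are illustrative.

```python
from datetime import datetime, timedelta


def future_timestamps(last_observed, periods, step):
    """Return `periods` timestamps after last_observed, spaced by `step`."""
    return [last_observed + step * i for i in range(1, periods + 1)]


last = datetime(2021, 1, 31)  # hypothetical last date in the training set

# Three steps ahead at daily and at 30-minute frequency.
daily = future_timestamps(last, periods=3, step=timedelta(days=1))
half_hourly = future_timestamps(last, periods=3, step=timedelta(minutes=30))
```

The same three steps cover three days in one case and only ninety minutes in the other. With a fitted Prophet model the equivalent call would be model.make_future_dataframe(periods=3, freq='D'), which by default also includes the historical dates in the returned dataframe.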
Features of Forecasts
Forecasting a time series using Prophet provides predictions for the defined future set along with the components of the forecast.
Forecasts can be made by calling the ‘predict’ function of the Prophet instance on the future set.
The forecasts can be visualized by passing the forecast to the ‘plot’ function of the Prophet instance.
Similarly, the components of the forecast can be visualized by passing the forecast to ‘plot_components’. The component plots show the trend in the data and each seasonality (and holiday effect) added to the Prophet model. This helps to understand the patterns captured by the model.
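Put together, the forecasting steps might look like the sketch below. It assumes a Prophet model that has already been fitted; the helper name is illustrative.

```python
def forecast_and_plot(model, periods, freq="D"):
    """Forecast with a fitted Prophet model and draw both standard plots."""
    future = model.make_future_dataframe(periods=periods, freq=freq)
    forecast = model.predict(future)  # includes yhat, yhat_lower, yhat_upper

    fig_forecast = model.plot(forecast)  # history plus forecast
    fig_components = model.plot_components(forecast)  # trend and seasonalities
    return forecast, fig_forecast, fig_components
```

Note that both plotting functions belong to the model and receive the forecast dataframe, not the future set.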
A plot of forecasts and its components generated on the future set:
Diagnostics in Prophet
Another impressive piece of Prophet functionality is Diagnostics, which provides cross-validation for measuring forecast error on historical data. The function can be used by importing cross_validation from Prophet.
from fbprophet.diagnostics import cross_validation (the package is named prophet from version 1.0 onwards)
- The training samples cover the specified initial number of days/hours.
- The samples for forecasting are taken from between cutoff and cutoff + horizon.
- A new forecast is made every ‘period’ days/hours, each from its own cutoff.
- Each forecast covers a span equal to the horizon value.
Brief description of the parameters of Diagnostics
It is important to understand the parameters while working with Diagnostics. Like other modules of Prophet, Diagnostics also comes with a range of parameters that make it very flexible.
Initial is the length of the training period that the model requires before the first forecast.
Period is the number of days/hours between consecutive cutoffs, that is, how often a forecast is made.
Horizon is the length of each forecast, measured from its cutoff. The output of cross-validation contains the predictions for every horizon, and metrics are generated based on the specified number of days/hours in the horizon.
Cutoff is a point in time, after which the data is held out for forecasting. The last cutoff is the difference between the maximum ‘ds’ in the dataset and the specified horizon; earlier cutoffs are spaced one period apart.
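The relationship between these parameters can be sketched in plain Python. The first function below computes cutoff dates by stepping back from the last date minus the horizon, keeping only cutoffs that leave at least the initial training period before them. It is a simplified illustration, not Prophet’s own implementation, and the dates and durations are made up. The second function is a hedged example of running the actual cross-validation on a fitted model.

```python
from datetime import datetime, timedelta


def cutoffs(first_ds, last_ds, initial, period, horizon):
    """Cutoff dates, oldest first, stepping back from last_ds - horizon."""
    result = []
    cutoff = last_ds - horizon
    while cutoff >= first_ds + initial:
        result.append(cutoff)
        cutoff -= period
    return list(reversed(result))


# One year of hypothetical daily data.
example = cutoffs(
    first_ds=datetime(2020, 1, 1),
    last_ds=datetime(2020, 12, 31),
    initial=timedelta(days=180),
    period=timedelta(days=60),
    horizon=timedelta(days=30),
)


def run_cross_validation(model, initial="180 days", period="60 days", horizon="30 days"):
    """Cross-validate a fitted Prophet model and summarise the errors."""
    from prophet.diagnostics import cross_validation, performance_metrics

    df_cv = cross_validation(model, initial=initial, period=period, horizon=horizon)
    return performance_metrics(df_cv)
```

For the example above there are three cutoffs, so the model would be refitted and evaluated three times, each time forecasting the 30 days that follow the cutoff.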
This concludes our introductory post on Facebook Prophet! Facebook’s Prophet is a solid library with many inbuilt features to handle challenges faced in time series forecasting. Users can create complex time series forecasting models with very little programming knowledge. I hope you got a grip on the basic features of Prophet! You can find a coded example of Facebook Prophet in Experimenting Facebook Prophet!
Thank you! I hope you will now be able to kick-start your work on time series forecasting using Facebook’s Prophet. If you find the blog helpful, do like and share.
Stay tuned and follow Digital Tesseract for many more such blogs on Data Science.
Official page of Prophet – https://facebook.github.io/prophet/
An Introductory Study on Time Series Modeling and Forecasting – https://arxiv.org/ftp/arxiv/papers/1302/1302.6613.pdf