How can I predict data

How forecasting works in Tableau

Forecasting in Tableau uses a method known as exponential smoothing. Forecasting algorithms try to find a regular pattern in measures that can be continued into the future. If you are interested in predictive modeling, which is also available in Tableau, see How Predictive Modeling Features Work in Tableau.

View video: The related concepts are presented in the free training video Forecasts (duration: 6 minutes). Use your tableau.com account to log in.

Typically, a forecast is added to a view that contains a date field and one or more measures. However, in the absence of a date, Tableau can forecast a view that contains an integer dimension and one or more measures.

To learn how to make a forecast, see Make a Forecast. For more information about forecasting using an integer dimension, see Running Forecasts Without a Date in the View.

Overview

All forecast algorithms are simple models of a real data generating process (DGP). For a high-quality forecast, a simple pattern in the DGP must match the pattern described by the model as closely as possible. Quality metrics measure how well the model matches the DGP. If the quality is low, the precision measured by the confidence bands is not important, because it measures the precision of an inaccurate estimate.

Tableau automatically selects the best of up to eight models, where the best model is the one that generates the highest-quality forecast. The smoothing parameters of each model are optimized before Tableau assesses forecast quality. The optimization method is global. Nevertheless, it is not impossible that the selected smoothing parameters are locally optimal without also being globally optimal. Initial value parameters are selected according to best practices but are not further optimized, so they may not be optimal. The eight models available in Tableau are described on the OTexts website: A Taxonomy of Exponential Smoothing Methods.

If there is not enough data in the visualization, Tableau automatically tries to forecast at a finer temporal granularity and then aggregates the forecast back to the granularity of the visualization. Tableau provides prediction bands that are either simulated or calculated from a closed-form equation. All models with a multiplicative component or with aggregated forecasts have simulated bands, while all other models use the closed-form equations.

Exponential smoothing and trend

Exponential smoothing models iteratively forecast future values of a regular time series from weighted averages of past values of the series. The simplest model, simple exponential smoothing, calculates the next level, or smoothed value, from a weighted average of the last actual value and the last level value. The method is exponential because each level value is influenced by every preceding actual value to an exponentially decreasing degree: more recent values are given greater weight.
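The level update described above can be sketched in a few lines of Python. This is a minimal illustration, not Tableau's implementation: the function names, the smoothing parameter alpha, and the choice to initialize the level with the first observation are assumptions made for the example.

```python
def simple_exponential_smoothing(values, alpha):
    """Return the smoothed level after each observation.

    alpha is a hypothetical smoothing parameter in (0, 1]. The first level
    is initialized to the first observation, which is one common convention
    (an assumption, not necessarily what Tableau does).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    levels = [values[0]]
    for actual in values[1:]:
        # New level = weighted average of the latest actual value
        # and the previous level.
        levels.append(alpha * actual + (1 - alpha) * levels[-1])
    return levels


def observation_weights(count, alpha):
    """Weights of the `count` most recent actual values, newest first."""
    return [alpha * (1 - alpha) ** age for age in range(count)]


series = [10.0, 12.0, 11.0, 13.0]
levels = simple_exponential_smoothing(series, alpha=0.5)
forecast = levels[-1]  # the one-step-ahead forecast equals the last level
```

With alpha = 0.5, the weights of the three most recent values are 0.5, 0.25, and 0.125, which shows the exponential decay: each step further into the past halves the influence of a value.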

Exponential smoothing models with trend or seasonal components are effective when the measure to be forecast shows a trend or seasonality over the period on which the forecast is based. Trend is the tendency of the data to increase or decrease over time. Seasonality is a repeating, predictable variation in a value, such as the annual fluctuation in temperature with the seasons.

In general, the more data points there are in a time series, the better the resulting forecast. Having enough data is particularly important if you want to model seasonality, because the model is more complicated and requires more evidence in the form of data to achieve a reasonable level of precision. However, if you forecast data generated by two or more different DGPs, you will get a lower-quality forecast because a model can only match a single DGP.

Seasonality

Tableau tests for a seasonal cycle with the length most typical for the time aggregation of the time series being forecast. So if you aggregate by months, Tableau looks for a 12-month cycle; if you aggregate by quarters, Tableau looks for a four-quarter cycle; and if you aggregate by days, Tableau looks for weekly seasonality. Therefore, if there is a six-month cycle in your time series, Tableau will probably find a 12-month pattern that contains two similar sub-patterns. However, if there is a seven-month cycle in your time series, Tableau will probably find no cycle at all. Fortunately, seven-month cycles are uncommon.

Tableau can determine the season length using either of two methods. The original temporal method uses the natural season length of the temporal granularity (TG) of the view. Temporal granularity is the finest unit of time expressed by the view. For example, if the view contains either a continuous (green) date truncated to months, or discrete (blue) year and month date parts, the view has a monthly temporal granularity. The newer non-temporal method (introduced in Tableau 9.3) uses periodic regression to check season lengths from 2 to 60 for candidate lengths.

Tableau automatically chooses the most appropriate method for a given view. If Tableau orders the measures in a view by date and the view has a quarterly, monthly, weekly, daily, or hourly temporal granularity, the season lengths are most likely 4, 12, 13, 7, or 24, respectively. Therefore, only the natural length of the TG is used to construct the five seasonal exponential smoothing models that Tableau supports. The AIC values of the five seasonal models and the three non-seasonal models are compared, and the model with the lowest AIC is returned. (For an explanation of the AIC metric, see "Forecast Descriptions".)
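The natural season lengths and the lowest-AIC selection described above can be written down as a small Python sketch. The dictionary restates the lengths given in the text. The model labels and AIC values are made-up placeholders used only to show the comparison; they are not Tableau output, and `pick_lowest_aic` is a hypothetical helper.

```python
# Natural season length for each temporal granularity, as listed above.
NATURAL_SEASON_LENGTH = {
    "quarter": 4,
    "month": 12,
    "week": 13,
    "day": 7,
    "hour": 24,
}


def pick_lowest_aic(aic_by_model):
    """Return the name of the model with the lowest AIC value."""
    if not aic_by_model:
        raise ValueError("at least one model is required")
    return min(aic_by_model, key=aic_by_model.get)


# Illustrative AIC values only (hypothetical model labels, not real results).
example_aic = {
    "non-seasonal, no trend": 412.3,
    "non-seasonal, additive trend": 405.9,
    "seasonal (length 12), additive trend": 388.1,
}
best_model = pick_lowest_aic(example_aic)
```

In this sketch a lower AIC wins regardless of whether the model is seasonal, which mirrors the way the seasonal and non-seasonal models compete in a single comparison.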

When Tableau is forecasting from an integer dimension, the second method is used. In this case there is no temporal granularity (TG); the potential season lengths must therefore be derived from the data.

The second method is also used for views with annual temporal granularity. Annual series rarely show seasonality; if they do, the season length must also be derived from the data.

The second method is also used for views with minute or second temporal granularity. If such series show seasonality, the season length is very likely 60. However, a regular real-world process may have a period that does not coincide with the clock. For minutes and seconds, Tableau therefore also checks the data for a season length other than 60. This does not mean that Tableau can model two different season lengths at the same time. Instead, ten seasonal models are estimated: five with a season length of 60 and five with the season length derived from the data. The forecast is then calculated using whichever of the ten seasonal or three non-seasonal models has the lowest AIC.

For annual, minute, or second series, a single season length derived from the data is tested if the pattern is fairly clear. For integer series, up to nine somewhat less distinct candidate season lengths are estimated for each of the five seasonal models, and the model with the lowest AIC is returned. If there are no likely candidates for the season length, only the non-seasonal models are estimated.

Because the selection is fully automatic when Tableau derives potential season lengths from the data, the default model type "Automatic" in the Model Type menu of the Forecast Options dialog box does not change. Selecting "Automatic without seasonality" skips the search for season lengths and the estimation of seasonal models entirely, which improves performance.

The heuristic that Tableau uses to decide whether to use season lengths derived from the data depends on the distribution of errors in the periodic regression for each candidate season length. If there is real seasonality in the data, the set of candidate season lengths usually contains one or two clearly leading lengths. So if a single candidate is returned, it very likely indicates seasonality. In this case, Tableau uses this candidate to estimate the seasonal models for annual, minute, and second granularity. A return of fewer than ten candidates (the maximum number) indicates possible seasonality. In this case, Tableau estimates the seasonal models with all returned candidates for integer views. If the maximum number of candidates is returned, the errors for most season lengths are similar, so seasonality is unlikely. In this case, Tableau estimates only the non-seasonal models for an integer or annual series, and only the seasonal models with the natural season length for other time series.
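The decision rule in the preceding paragraph can be sketched as a small Python function. This is a reading of the prose, not Tableau's code: the function name and the view type labels are hypothetical, and the branch for annual, minute, or second views that return several (but fewer than ten) candidates is an assumption, since the text only says that a single clear candidate is used for those views.

```python
MAX_CANDIDATES = 10
NATURAL_LENGTH = {"minute": 60, "second": 60}  # annual and integer views have none


def season_lengths_to_estimate(view_type, candidates):
    """Return the season lengths for which seasonal models would be estimated.

    An empty list means that only the non-seasonal models are estimated.
    `candidates` is the list of season lengths returned by the periodic
    regression search.
    """
    if view_type not in ("integer", "year", "minute", "second"):
        raise ValueError(f"unsupported view type: {view_type}")
    natural = [NATURAL_LENGTH[view_type]] if view_type in NATURAL_LENGTH else []
    count = len(candidates)
    if count == 0 or count >= MAX_CANDIDATES:
        # Seasonality unlikely: fall back to the natural length, if any.
        return natural
    if view_type == "integer":
        # Possible seasonality: integer views use every returned candidate.
        return list(candidates)
    if count == 1:
        # A single clear candidate is used in addition to the natural length.
        return natural + [c for c in candidates if c not in natural]
    # Assumption: several unclear candidates are not used for these views.
    return natural
```

For example, a minute-level view with the single candidate 45 yields season lengths 60 and 45, which corresponds to the ten seasonal models (five per length) mentioned earlier.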

With the "Automatic" model type for integer, annual, minute and second views, candidates for the season length are always derived from the data, regardless of whether these candidates are ultimately used or not. The model estimation takes significantly more time than the periodic regression; therefore, there is usually only a slight loss of performance.

Model types

In the Forecast Options dialog box, you can choose the model type that Tableau uses for forecasting. The Automatic setting is usually optimal for most views. If you select Custom, you can specify the trend and seasonal characteristics independently (none, additive, or multiplicative).

In an additive model, the contributions of the model components are summed. In a multiplicative model, at least some component contributions are multiplied. Multiplicative models can significantly improve forecast quality for data whose trend or seasonality is affected by the level (magnitude) of the data.

Note that you don't need to build a custom model to generate a multiplicative forecast: the Automatic setting can determine whether a multiplicative forecast is appropriate for your data. However, a multiplicative model cannot be calculated if the measure to be forecast contains one or more values less than or equal to zero.
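The difference between additive and multiplicative components, and the restriction to positive values, can be illustrated with a short Python sketch. The function names and the numbers are hypothetical, and the formulas show one common form (additive trend combined with additive or multiplicative seasonality), not necessarily the exact equations Tableau uses.

```python
def additive_forecast(level, trend, seasonal_offset, steps_ahead):
    """Additive model: all component contributions are summed."""
    return level + steps_ahead * trend + seasonal_offset


def multiplicative_seasonal_forecast(level, trend, seasonal_factor, steps_ahead):
    """Additive trend with a multiplicative seasonal factor."""
    return (level + steps_ahead * trend) * seasonal_factor


def can_use_multiplicative(values):
    """A multiplicative model requires every value to be greater than zero."""
    return all(value > 0 for value in values)


# The same seasonal effect at a low and a high level of the series.
low_additive = additive_forecast(100, 10, 25, steps_ahead=1)
high_additive = additive_forecast(1000, 10, 25, steps_ahead=1)
low_multiplicative = multiplicative_seasonal_forecast(100, 10, 1.25, steps_ahead=1)
high_multiplicative = multiplicative_seasonal_forecast(1000, 10, 1.25, steps_ahead=1)
```

The additive seasonal effect stays at 25 units at both levels, while the multiplicative effect grows with the level of the series (27.5 units at the low level, 252.5 at the high level). This is why multiplicative models suit data whose seasonality scales with magnitude.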

Forecast with time

When you forecast with a date, the view can contain only one base date. Date parts are supported, but all parts must refer to the same underlying field. Dates can be on Rows, Columns, or Marks (with the exception of the Tooltip target).

Tableau supports three types of dates, two of which can be used for forecasting:

  • Truncated dates refer to a specific point in time with a certain temporal granularity, for example February 2017. These dates are usually continuous, so they have a green background in the view. Truncated dates can be used for forecasting.

  • Date parts refer to a particular member of a temporal measure (for example, February). Each date part is represented by a separate, usually discrete field (with a blue background). Forecasting requires at least a Year date part. The following date parts are allowed for forecasting:

    • Year

    • Year + quarter

    • Year + month

    • Year + quarter + month

    • Year + week

    • Custom: Month/Year, Month/Day/Year

    Other date parts, such as Quarter or Quarter + month, are not allowed for forecasting. For more information about the different date types, see Converting Discrete Fields to Continuous Fields and Vice Versa.

  • Exact dates refer to a specific point in time with maximum temporal granularity, for example February 1, 2012 at 2:23:45.0. Exact dates are not allowed for forecasting.

Forecasts without a date are also possible. For more information, see Running Forecasts Without a Date in the View.

Granularity and trimming

When you create a forecast, you select a date dimension that specifies the unit of time in which date values are measured. Tableau supports a variety of such time units, including year, quarter, month, and day. The unit you choose for the date value is called the granularity of the date.

The values in your measure usually do not align precisely with your unit of granularity. You might set your date value to quarters, but your actual data may end in the middle of a quarter, for example, at the end of November. This can cause problems because the forecast model treats the value for this partial quarter as a full quarter, even though it is typically lower than the value of a full quarter would be. If the forecast model is allowed to include this data, the resulting forecast is inaccurate. The solution is to trim the data so that trailing periods that could distort the forecast are ignored. Use the Ignore Last option in the Forecast Options dialog box to remove, or trim, such partial periods. The default is to trim one period.
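The effect of trimming can be sketched in Python as dropping the trailing periods before a model is fitted. The function name and the sample values are hypothetical; the parameter `ignore_last` merely mirrors the Ignore Last option and its default of one period.

```python
def trim_trailing_periods(period_values, ignore_last=1):
    """Return the series without its last `ignore_last` periods."""
    if ignore_last < 0:
        raise ValueError("ignore_last must not be negative")
    if ignore_last == 0:
        return list(period_values)
    return list(period_values[:-ignore_last])


# Hypothetical quarterly totals; the last quarter only contains data
# through late November, so its total is misleadingly low.
quarterly_totals = [120, 130, 125, 140, 60]
model_input = trim_trailing_periods(quarterly_totals)
```

Here the partial value 60 is excluded, so the model sees only the four complete quarters.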

Obtaining more data

Tableau needs at least five data points in the time series to estimate a trend, and enough data points for at least two seasons or one season plus five periods to estimate seasonality. For example, at least nine data points are required to estimate a model with a four-quarter seasonal cycle (4 + 5), and at least 24 to estimate a model with a 12-month seasonal cycle (2 x 12).
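The two worked examples above (nine points for a four-quarter cycle, 24 points for a 12-month cycle) are both reproduced by taking the larger of "two seasons" and "one season plus five periods". The Python sketch below encodes that reading; it is an inference from the examples, not a documented formula, and the function name is hypothetical.

```python
MIN_POINTS_FOR_TREND = 5


def min_points_for_seasonality(season_length):
    """Minimum data points to estimate a seasonal model.

    Assumption: the requirement is the larger of two full seasons and
    one season plus five periods, which matches both examples in the text.
    """
    if season_length < 2:
        raise ValueError("season_length must be at least 2")
    return max(2 * season_length, season_length + 5)


quarterly_minimum = min_points_for_seasonality(4)   # 4 + 5
monthly_minimum = min_points_for_seasonality(12)    # 2 x 12
```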

If you turn on forecasting for a view that does not have enough data points to support a good forecast, Tableau can sometimes retrieve enough data points to produce a valid forecast. It does this by querying the data source at a finer level of granularity.

  • If your view contains fewer than nine years of data, by default Tableau queries the data source for quarterly data, estimates a quarterly forecast, and aggregates it to an annual forecast for display in your view. If that still does not yield enough data points, Tableau estimates a monthly forecast and returns the aggregated annual forecast to your view.

  • If your view contains fewer than nine quarters of data, by default Tableau estimates a monthly forecast and returns the aggregated quarterly forecast results to your view.

  • If your view contains fewer than nine weeks of data, by default Tableau estimates a daily forecast and returns the aggregated weekly forecast results to your view.

  • If your view contains fewer than nine days of data, by default Tableau estimates an hourly forecast and returns the aggregated daily forecast results to your view.

  • If your view contains fewer than nine hours of data, by default Tableau estimates a minutely forecast and returns the aggregated hourly forecast results to your view.

  • If your view contains fewer than nine minutes of data, by default Tableau estimates a secondly forecast and returns the aggregated minutely forecast results to your view.

These adjustments happen in the background and require no configuration. Tableau does not change the appearance of your visualization, and it does not actually change your date value. However, the summary of the forecast period in the Describe Forecast and Forecast Options dialog boxes reflects the granularity actually used.

Tableau can only get more data if the aggregation of the measure you are forecasting is SUM or COUNT. For more information about available aggregation types and how to change the aggregation type, see Data Aggregation in Tableau.
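The fallback rules in this section can be summarized in a short Python sketch. The mapping restates the list above, and the aggregation check restates the SUM or COUNT restriction. The function name is hypothetical, and the sketch only models a single step to the next finer granularity; it does not model the second fallback from quarterly to monthly data described for annual views.

```python
MIN_PERIODS = 9

# Next finer granularity that is queried when the view has too few periods.
FINER_GRANULARITY = {
    "year": "quarter",
    "quarter": "month",
    "week": "day",
    "day": "hour",
    "hour": "minute",
    "minute": "second",
}


def forecast_granularity(view_granularity, period_count, aggregation):
    """Return the granularity at which the forecast would be estimated."""
    if period_count >= MIN_PERIODS:
        return view_granularity
    if aggregation.upper() not in ("SUM", "COUNT"):
        # More data can only be retrieved for SUM or COUNT aggregations.
        return view_granularity
    # Granularities without a listed fallback are left unchanged.
    return FINER_GRANULARITY.get(view_granularity, view_granularity)
```

For example, a view with six years of SUM data would be forecast quarterly and aggregated back to years, while the same view with an AVG aggregation would stay at annual granularity.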