Individual forecasts#
As Song et al. [2019] illustrate, in tourism research, time series models forecast demand, while econometric models search for cause-and-effect relationships between economic factors and tourism demand. The empirical models used here include time series models, in which SARIMAX and VAR capture univariate and multivariate relationships, respectively; an econometric model, which attempts to establish a relationship between the Google Trends index and a constructed ratio; and forecast combinations by linear and least squares approaches.
SARIMAX#
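Building on previous tourism research, a Seasonal Auto-Regressive Integrated Moving Average with eXogenous factors (SARIMAX) model is employed to track each PIC’s VAs. A SARIMAX model consists of three elements: AR, MA, and X. The AR part of the model uses the past values of the differenced series to make predictions, the MA part uses the past errors of the model, and the X part adds exogenous regressors. In standard notation, a SARIMAX model can be written as:

$$\phi_p(L)\,\tilde{\phi}_P(L^s)\,\Delta^d \Delta_s^D\, y_{i,t} = \theta_q(L)\,\tilde{\theta}_Q(L^s)\,\varepsilon_{i,t} + \sum_{k=1}^{n} \beta_k x_{k,t}$$

where:
$y_{i,t}$ is the Visitor Arrivals at time $t$ for country $i$;
$x_{k,t}$, with coefficients $\beta_k$, denotes exogenous variables defined at each time step;
$L$ is the lag operator and $L^s$ is the seasonal lag operator;
$p$ is the order of the AR part, $q$ is the order of the MA part, and $d$ is the degree of first differencing involved ($P$, $Q$, and $D$ are their seasonal counterparts);
$\phi_p(L)$ is an order-$p$ polynomial function of $L$ from the AR part, and $\theta_q(L)$ is defined analogously to $\phi_p(L)$ for the MA part;
$\Delta^d$ is the integration operator and $\Delta_s^D$ takes the seasonal differences of the series.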
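To make the integration and seasonal-differencing operators concrete, here is a minimal sketch in plain Python. The function names, the toy series, and the seasonal period of 12 (monthly data) are illustrative assumptions, not part of the original model specification.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def apply_differencing(series, d=0, D=0, season=12):
    """Apply D seasonal differences (lag = season), then d first differences."""
    out = list(series)
    for _ in range(D):
        out = difference(out, lag=season)
    for _ in range(d):
        out = difference(out, lag=1)
    return out


# Hypothetical monthly series: linear trend plus a bump every 12th month.
monthly = [100 + 2 * t + (10 if t % 12 == 6 else 0) for t in range(36)]

# One seasonal difference removes the yearly bump and leaves a constant,
# and one further first difference removes that constant.
stationary = apply_differencing(monthly, d=1, D=1, season=12)
```

Each seasonal difference shortens the series by `season` observations and each first difference by one, which matters when the sample is as short as the one used here.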
The model applies a scaled logit transformation to the series before fitting, so that the back-transformed forecasts cannot be negative.
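As a rough illustration of the scaled logit idea, the sketch below follows the form given in Hyndman and Athanasopoulos, which maps values in an interval (lower, upper) onto the whole real line. The bounds used in the example (0 and 1000) and the function names are assumptions for illustration; the source does not state which bounds were chosen.

```python
import math


def scaled_logit(x, lower, upper):
    """Map x in the open interval (lower, upper) onto the real line."""
    return math.log((x - lower) / (upper - x))


def inv_scaled_logit(y, lower, upper):
    """Map a real number back into the open interval (lower, upper)."""
    return (upper - lower) * math.exp(y) / (1 + math.exp(y)) + lower


# Hypothetical bounds: arrivals cannot fall below 0 or exceed 1000.
transformed = scaled_logit(250.0, lower=0.0, upper=1000.0)
recovered = inv_scaled_logit(transformed, lower=0.0, upper=1000.0)
```

A model fitted on the transformed scale can forecast any real number, and the inverse transform always returns a value strictly inside the chosen bounds, so a lower bound of zero rules out negative arrivals.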
Vector AutoRegressive and Moving Average Model (VARMA)#
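Given the potential relationship between the official VA and GAD data, a VARMA model could improve on univariate time series modeling by considering the linkages between the ISA from the GAD and the official VA data. A VAR(p) model with exogenous variables is formally expressed as:

$$\mathbf{y}_t = \mathbf{c} + \sum_{j=1}^{p} A_j \mathbf{y}_{t-j} + B \mathbf{x}_t + \boldsymbol{\varepsilon}_t$$

where $\mathbf{y}_t$ is the vector of endogenous series at time $t$ (here the official VA and the ISA), $\mathbf{c}$ is a vector of intercepts, $A_j$ are the coefficient matrices on the lagged values, $\mathbf{x}_t$ is the vector of exogenous variables with coefficient matrix $B$, and $\boldsymbol{\varepsilon}_t$ is a vector of white-noise errors.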
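The mechanics of a VAR(p) model with exogenous variables can be sketched as a one-step-ahead forecast: an intercept, plus each lag matrix applied to the matching past observation, plus an exogenous term. All names and numbers below are hypothetical, and in practice the coefficients would be estimated from the data rather than written by hand.

```python
def varx_one_step(history, intercept, lag_matrices, exog_matrix=None, exog=None):
    """One-step-ahead forecast from a VAR(p) model with optional exogenous terms.

    history      : list of past observation vectors, oldest first
    intercept    : list of k intercepts
    lag_matrices : list of p matrices (k x k), first entry is for lag 1
    exog_matrix  : optional k x m coefficient matrix for exogenous variables
    exog         : optional list of m exogenous values for the forecast period
    """
    k = len(intercept)
    forecast = list(intercept)
    for lag, matrix in enumerate(lag_matrices, start=1):
        past = history[-lag]
        for row in range(k):
            forecast[row] += sum(matrix[row][col] * past[col] for col in range(k))
    if exog_matrix is not None and exog is not None:
        for row in range(k):
            forecast[row] += sum(
                exog_matrix[row][j] * exog[j] for j in range(len(exog))
            )
    return forecast


# Hypothetical two-variable system (for example arrivals and seats), VAR(2).
history = [[1.0, 2.0], [3.0, 4.0]]
a1 = [[0.5, 0.0], [0.0, 0.5]]
a2 = [[0.1, 0.0], [0.0, 0.1]]
prediction = varx_one_step(history, [0.5, -0.5], [a1, a2])
```

Off-diagonal entries in the lag matrices are what let one series help predict the other, which is the linkage a univariate model cannot capture.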
Ratio Approach#
Unlike the VAR, whose vector contains multiple series, an alternative way to link the GAD with the official VA data is to combine them into a single ratio, which is then related by OLS to the following variables: the Google Search Index at time $t$; a dummy variable set to 1 after the WHO declared Covid-19 a global pandemic; and the Global Stringency Index from OWID.
With a limited sample size (fewer than 48 observations) and potential autocorrelation, the errors may violate the OLS assumptions of homoskedasticity and no serial correlation. The coefficient estimates remain unbiased, but the usual standard errors are unreliable. Thus, to correct the standard errors, we employ the Heteroskedasticity- and Autocorrelation-Consistent (HAC) estimator and choose the lag length as Wooldridge suggests.
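The following sketch shows how a HAC (Newey-West) standard error can be computed for the slope of a one-regressor OLS model, using Bartlett kernel weights. It is a simplified illustration, not the estimation used in the study: the actual regression has several regressors, the data here are made up, and the lag rule shown (the integer part of 4(n/100)^(2/9), one of the rules of thumb reported in Wooldridge's textbook) is an assumption about which rule was applied.

```python
import math


def newey_west_lag(n):
    """Rule-of-thumb truncation lag: integer part of 4 * (n / 100) ** (2 / 9)."""
    return int(4 * (n / 100) ** (2 / 9))


def hac_slope_se(x, y, lags):
    """OLS slope of y on x (with intercept) and its Newey-West standard error."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    r = [xi - x_bar for xi in x]  # regressor with the intercept partialled out
    sst = sum(ri * ri for ri in r)
    slope = sum(ri * (yi - y_bar) for ri, yi in zip(r, y)) / sst
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    a = [ri * ui for ri, ui in zip(r, resid)]
    v = sum(ai * ai for ai in a)  # heteroskedasticity-robust part
    for h in range(1, lags + 1):
        weight = 1 - h / (lags + 1)  # Bartlett kernel weight
        v += 2 * weight * sum(a[t] * a[t - h] for t in range(h, n))
    return slope, math.sqrt(v) / sst


# Hypothetical data: a linear relation with an alternating disturbance.
x = [float(t) for t in range(20)]
y = [1.0 + 2.0 * t + 0.5 * (-1) ** t for t in range(20)]
lags = newey_west_lag(len(x))
slope, se = hac_slope_se(x, y, lags)
```

With `lags=0` the formula reduces to a heteroskedasticity-robust standard error; the extra weighted terms account for correlation between residuals up to `lags` periods apart.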
Model Evaluation#
To evaluate the model’s performance, benchmark results will be provided.2 Three benchmark methods are employed:
Average method, where the forecasts of all future values are equal to the average (or “mean”) of the historical data.
Naïve method, setting all forecasts to be the value of the last observation.
Drift method, $\hat{y}_{T+h|T} = y_T + h \, \frac{y_T - y_1}{T - 1}$, which is equivalent to drawing a line between the first and last observations and extrapolating it into the future.
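The three benchmark methods are simple enough to sketch directly in plain Python. The function names and the toy history are illustrative assumptions.

```python
def mean_forecast(history, horizon):
    """Average method: every future value equals the historical mean."""
    average = sum(history) / len(history)
    return [average] * horizon


def naive_forecast(history, horizon):
    """Naive method: every future value equals the last observation."""
    return [history[-1]] * horizon


def drift_forecast(history, horizon):
    """Drift method: extend the line through the first and last observations."""
    slope = (history[-1] - history[0]) / (len(history) - 1)
    return [history[-1] + h * slope for h in range(1, horizon + 1)]


# Hypothetical visitor-arrival history and a three-step horizon.
history = [10.0, 12.0, 11.0, 16.0]
benchmarks = {
    "mean": mean_forecast(history, 3),
    "naive": naive_forecast(history, 3),
    "drift": drift_forecast(history, 3),
}
```

Because none of these methods estimates any structure beyond a level or a straight line, a candidate model that cannot beat them on held-out data adds little forecasting value.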
For each method, Mean Squared Error (MSE), Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Symmetric Mean Absolute Percentage Error (SMAPE) are provided. 3
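A minimal sketch of the four error measures follows. SMAPE has several variants in the literature; the version below, which divides the absolute error by the average of the absolute actual and predicted values and counts a term as zero when both are zero, is an assumption and may differ from the exact definition used in the study.

```python
import math


def forecast_errors(actual, predicted):
    """Return MSE, RMSE, MAE, and SMAPE (in percent) for paired values."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    smape_terms = []
    for a, p in zip(actual, predicted):
        denominator = abs(a) + abs(p)
        # When actual and predicted are both zero the error is zero.
        smape_terms.append(0.0 if denominator == 0 else 2 * abs(a - p) / denominator)
    smape = 100 * sum(smape_terms) / n
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "SMAPE": smape}


# Hypothetical actual and forecast values, including a zero actual.
scores = forecast_errors([0.0, 10.0, 20.0], [0.0, 12.0, 16.0])
```

Unlike a percentage error that divides by the actual value alone, this form stays finite when an actual value is zero, as long as the forecast for that period is not also the only nonzero quantity missing from the denominator.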
- 1
A hidden assumption behind the ratio approach is that airlines dynamically adjust the combination of fares, aircraft size, and load factor to maximize profits in each market. See Graham, Kaplan, and Sibley (1983).
- 2
Rob J Hyndman and George Athanasopoulos, Forecasting: principles and practice (OTexts, 2018).
- 3
Mean Absolute Percentage Error (MAPE) is also frequently reported, but because the actual values contain some zeros, it would be undefined, considering $\text{MAPE} = \frac{100}{n} \sum_{t=1}^{n} \left| \frac{y_t - \hat{y}_t}{y_t} \right|$.