Predictive performance of Bayesian structural time series and Google data
Myllymäki, Nikolai (2018-11-14)
This publication is subject to copyright. The work may be read and printed for personal use. Commercial use is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe2018112849381
Abstract
Scott and Varian (2014) present a Bayesian structural time series method for short-term forecasting, or "nowcasting", of economic time series with the help of hundreds of explanatory variables obtained from search engine query data. In this thesis, the predictive performance of the original models nowcasting U.S. initial unemployment claims and U.S. retail sales is examined on the five years of data that have accumulated since the publication of the original study. When nowcasting initial unemployment claims, Google data is found to reduce the root mean squared error of predictions relative to a standard ARIMA model on data outside the original sample. Conversely, no notable performance improvements are found from using Google search query data in a Bayesian structural time series model to nowcast retail sales.
Possible causes for these disparities in prediction performance are studied, and the most compelling explanation found is that a highly informative prior was used in the original study for the reference time series model without any regression component. When a noninformative prior is used in the retail sales analysis over the original study period, the predictive performance of the baseline model improves to be on par with the models with Google data, eliminating the perceived advantage of models with Google data over a pure time series model.
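The comparisons summarized above rest on the root mean squared error (RMSE) of nowcasts from competing models. As a minimal sketch of that kind of comparison, the Python snippet below computes the RMSE of two sets of predictions against the same observed values, along with the relative change between them. The function names and all numbers are hypothetical and purely illustrative; they are not taken from the thesis, and the snippet does not reproduce the Bayesian structural time series or ARIMA models themselves.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    if len(actual) == 0:
        raise ValueError("sequences must be non-empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def relative_rmse_change(actual, baseline_pred, candidate_pred):
    """Fractional change in RMSE of the candidate relative to the baseline.

    Negative values mean the candidate predicts better than the baseline.
    """
    base = rmse(actual, baseline_pred)
    cand = rmse(actual, candidate_pred)
    return (cand - base) / base


# Made-up toy values (assumption: NOT data from the thesis).
actual = [10.0, 12.0, 11.0, 13.0]
baseline_nowcast = [11.0, 11.0, 12.0, 12.0]    # e.g. a pure time series model
candidate_nowcast = [10.5, 11.5, 11.5, 12.5]   # e.g. a model with extra regressors

print(rmse(actual, baseline_nowcast))
print(rmse(actual, candidate_nowcast))
print(relative_rmse_change(actual, baseline_nowcast, candidate_nowcast))
```

A comparison like this is only meaningful when both sets of predictions are produced for the same out-of-sample periods and the baseline is given a fair specification, which is the concern the abstract raises about the prior used for the reference model.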