Qarmalah, Najla M., Einbeck, Jochen and Coolen, Frank P. A. - Archives of Data Science, Series A

Article Details

Title Mixture Models for Prediction from Time Series, with Application to Energy Use Data
Authors Qarmalah, Najla M., Einbeck, Jochen and Coolen, Frank P. A.
Year 2017
Volume 2(1)
Abstract This paper aims to use mixture models to produce predictions from time series data. Given data of the form $(t_i, y_i)$, $i = 1, \ldots, T$, we propose a mixture model localized at time point $t_T$ with the $k$-th component as $y_i = m_k(t_i) + \epsilon_{ik}$ with mixing proportions $\pi_k(t_i)$ such that $0 \le \pi_k(t_i) \le 1$ and $\sum_{k=1}^{K} \pi_k(t_i) = 1$, where $K$ is the number of components. The $m_k(\cdot)$ are smooth unspecified regression functions, and the errors $\epsilon_{ik} \sim N(0, \sigma^2)$ are independently distributed. Estimation of this model is achieved through a kernel-weighted version of the EM algorithm, using exponential kernels with different bandwidths (neighbourhood sizes) $h_k$ as weight functions. By modelling a mixture of local regressions at a target time point $t_T$ but with different bandwidths $h_k$, the estimated mixture probabilities are informative for the amount of information available in the data set at the scale of resolution corresponding to each bandwidth. Nadaraya-Watson and local linear estimators are used to carry out the localized estimation step. For prediction at time point $t_{T+1}$, adequate methods are provided for each local method, and compared to competing forecasting routines. The data under study give the energy use for Bolivia, Lebanon, and Greece from 1971 to 2011.

1 Introduction

Mixture models play an important role in the statistical analysis of data thanks to their flexibility in modelling a wide variety of random phenomena. They have been successfully employed in marketing and econometrics (Frühwirth-Schnatter, 2001) as well as biology and epidemiology (Green and Richardson, 2002), to name a few out of a huge number of fields of application. One useful type of mixture model is the mixture of regression models. Mixtures of regression models are appropriate when the observations come from several subgroups with missing grouping identities and, in each subgroup, the response has a linear relationship with one or more other recorded variables.
Many efforts have been made to extend such models, for example to finite mixtures of generalized linear models, which are comprehensively discussed by McLachlan and Peel (2004). Bayesian approaches for mixture regression models are summarized by Frühwirth-Schnatter (2006). Mixture models continue to be a topic of intense research activity, with special issues being edited in close succession (Böhning et al, 2014; Hinde et al, 2016). A large proportion of the articles in those special issues discuss variants of mixture regression models, such as Poisson regression, spline regression, or regression under censoring. Recently, mixtures of nonparametric regression models, which relax the linearity assumption on the regression functions, have gained particular attention. For example, Young and Hunter (2010) use kernel regression to model covariate-dependent proportions for mixtures of linear regression models, an idea which was further developed into a semi-parametric approach by Huang and Yao (2012). Huang et al (2013) have proposed a nonparametric finite regression mixture model where the mixing proportions, the mean functions, and the variance functions are all nonparametric, with application to the U.S. house price index (HPI) data. However, to our knowledge, there is no statistical method for prediction from time series based on mixture models and nonparametric regression. Nonparametric regression is a technique for modelling (possibly non-linear) trends in data. One approach to nonparametric regression is local modelling, which estimates the mean function $m(t)$ locally using a set of parametric models.
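
To make the idea of local modelling concrete, the following is a minimal Python sketch of two standard local estimators of a mean function at a target point: the Nadaraya-Watson estimator (a kernel-weighted local constant fit) and the local linear estimator (a kernel-weighted straight-line fit). It is an illustration only, not the authors' implementation. The function names, the Gaussian kernel, the bandwidth value, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def gaussian_kernel(u):
    """Gaussian weight function (an assumed choice for this sketch)."""
    return np.exp(-0.5 * u ** 2)


def nadaraya_watson(t, y, t0, h):
    """Local constant estimate of the mean function at t0 with bandwidth h."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = gaussian_kernel((t - t0) / h)
    # Kernel-weighted average of the responses.
    return np.sum(w * y) / np.sum(w)


def local_linear(t, y, t0, h):
    """Local linear estimate of the mean function at t0 with bandwidth h."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    w = gaussian_kernel((t - t0) / h)
    # Design matrix centred at t0, so the intercept is the estimate at t0.
    X = np.column_stack([np.ones_like(t), t - t0])
    XtW = X.T * w  # each column of X.T scaled by its kernel weight
    # Weighted least squares: solve (X' W X) beta = X' W y.
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return beta[0]


# Synthetic, purely illustrative data: an exact linear trend on a yearly grid.
t = np.arange(1971, 2012, dtype=float)
y = 2.0 + 3.0 * (t - 1971.0)

nw_fit = nadaraya_watson(t, y, t0=2011.0, h=5.0)
ll_fit = local_linear(t, y, t0=2011.0, h=5.0)
```

On exactly linear data the local linear fit reproduces the trend at the boundary point, whereas the Nadaraya-Watson fit is pulled towards the earlier, lower observations. The bandwidth `h` controls how many neighbouring observations contribute to the fit.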