In this paper, we study the application of time series forecasting methods to massive datasets of short financial time series. In our example, the time series arise from analyzing monthly expenses and income in personal financial records. Unlike traditional time series forecasting applications, we work with very short series (as few as 24 data points), which does not allow us to rely on classical exponential smoothing methods. However, this shortcoming is compensated for by the size of our dataset: millions of time series. This allows us to tackle the problem of time series prediction from a pattern recognition perspective. Specifically, we propose a method for short time series prediction based on time series clustering and distance-based regression. We show experimentally that this strategy leads to improved accuracy compared to exponential smoothing methods. In addition, we describe the underlying big data platform developed to carry out the forecasting efficiently, since we perform millions of item comparisons in near real time.
financial time series; big data; forecasting; conditional mean; Holt-Winters; clustering
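
To make the idea of distance-based regression concrete, the following is a minimal sketch and not the method evaluated in this paper. It assumes a plain k-nearest-neighbour scheme: the next value of a short series is estimated as the mean of the values that followed its closest matches in a library of other series (a simple conditional mean estimate). The function names `euclidean` and `knn_forecast`, the Euclidean distance, the choice of `k`, and the toy data are all illustrative assumptions, and the clustering step is omitted.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_forecast(history, library, k=3):
    """Estimate the next value of `history` (hypothetical helper).

    Each library series longer than `history` is split into a prefix of the
    same length and the single value that follows it. The forecast is the
    mean of the following values of the k prefixes closest to `history`.
    """
    n = len(history)
    candidates = []
    for series in library:
        if len(series) <= n:
            continue  # no continuation available for this series
        candidates.append((euclidean(history, series[:n]), series[n]))
    if not candidates:
        raise ValueError("no library series is longer than the history")
    candidates.sort(key=lambda pair: pair[0])  # closest prefixes first
    nearest = candidates[:k]
    return sum(value for _, value in nearest) / len(nearest)


# Toy library of short series (made-up numbers, for illustration only).
library = [
    [10, 12, 11, 13, 12],
    [10, 11, 11, 11, 14],
    [50, 55, 52, 58, 60],
    [9, 12, 10, 13, 13],
]
forecast = knn_forecast([10, 12, 11, 13], library, k=2)
print(forecast)
```

With millions of series, a brute-force scan like this one would be too slow for near real-time use; restricting the search to the cluster that the query series falls into is one way such a comparison could be made cheaper.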