Improving the separation of direct and diffuse solar radiation components using machine learning by gradient boosting

authors

ALER MUR, RICARDO
GALVAN LEON, INES MARIA
RUIZ ARIAS, JOSE A.
GUEYMARD, CHRISTIAN

published in

Solar Energy Journal

publication date

July 2017

start page

558

end page

569

volume

150

Digital Object Identifier (DOI)

https://doi.org/10.1016/j.solener.2017.05.018

full text

http://hdl.handle.net/10016/30337

International Standard Serial Number (ISSN)

0038-092X

Electronic International Standard Serial Number (EISSN)

1471-1257

abstract

Based on a large and recently developed database of 1-min irradiance and ancillary data observations at 54 world stations, this study uses the gradient boosting Machine Learning (ML) technique to improve the process of component separation, through which the direct and diffuse solar radiation components are estimated from 1-min global horizontal irradiance data. Here, the XGBoost implementation of gradient boosting is used with ensembles of both linear and non-linear weak prediction models. The predictions of 140 separation models from the literature are combined using XGBoost to reduce the random errors of the predictions of the individual separation models at any of the validation sites. The minimum prediction error is essentially achieved by a combination of 26 of the original 140 models, with no meaningful reduction in error from combining more models. Most of these 26 models use at least three inputs in addition to the clearness index. In parallel, XGBoost is also used to separate the components directly from the inputs to the separation models. Of the 24 possible inputs used in the original 140 separation models, only 14 are found to be relevant. These 14 inputs could be used with appropriate formalism to subsequently develop a better separation model. It is found that when the training and validation datasets are not collocated, the RMSD of the predictions increases by 2% on average with respect to the case of collocated datasets. Overall, the present results indicate that a data-driven ML approach combining a limited number of existing models can be used to considerably decrease the currently large random errors associated with such models when used separately at high temporal frequency. © 2017 Elsevier Ltd. All rights reserved.
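As an illustration of what component separation involves, the minimal sketch below splits global horizontal irradiance (GHI) into direct normal (DNI) and diffuse horizontal (DHI) components using the standard closure relation GHI = DNI · cos(Z) + DHI, and computes the clearness index mentioned in the abstract. It is not taken from the paper: the function names, the nominal solar constant of 1361 W/m², and the idea of passing in a diffuse fraction supplied by some separation model are all assumptions made for illustration.

```python
import math

# Assumed nominal extraterrestrial normal irradiance in W/m^2 (not from the paper).
SOLAR_CONSTANT = 1361.0


def clearness_index(ghi, zenith_deg, e0n=SOLAR_CONSTANT):
    """Ratio of GHI to the extraterrestrial irradiance on a horizontal plane."""
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0:
        raise ValueError("sun must be above the horizon")
    return ghi / (e0n * cos_z)


def separate(ghi, zenith_deg, diffuse_fraction):
    """Split GHI into (DNI, DHI) given a diffuse fraction K = DHI / GHI.

    The diffuse fraction would come from a separation model; here it is
    simply an input (hypothetical interface).
    """
    if not 0.0 <= diffuse_fraction <= 1.0:
        raise ValueError("diffuse fraction must lie in [0, 1]")
    cos_z = math.cos(math.radians(zenith_deg))
    if cos_z <= 0:
        raise ValueError("sun must be above the horizon")
    dhi = diffuse_fraction * ghi
    # Closure relation: GHI = DNI * cos(Z) + DHI
    dni = (ghi - dhi) / cos_z
    return dni, dhi


if __name__ == "__main__":
    dni, dhi = separate(ghi=500.0, zenith_deg=60.0, diffuse_fraction=0.4)
    kt = clearness_index(ghi=500.0, zenith_deg=60.0)
    print(f"DNI={dni:.1f} W/m^2, DHI={dhi:.1f} W/m^2, kt={kt:.3f}")
```

In this framing, each separation model from the literature is a different way of estimating the diffuse fraction from the clearness index and other inputs; the study combines the outputs of many such models with gradient boosting.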
