Variable selection with P-splines in functional linear regression: application in graft-versus-host disease.

publication date

  • June 2020

start page

  • 1670

end page

  • 1686

volume

  • 62

International Standard Serial Number (ISSN)

  • 0323-3847

Electronic International Standard Serial Number (EISSN)

  • 1521-4036

abstract

  • This paper focuses on the problems of estimation and variable selection in the functional linear regression model (FLM) with functional response and scalar covariates. To this end, two different types of regularization (L1 and L2) are considered in this paper. On the one hand, a sample approach for functional LASSO in terms of basis representation of the sample values of the response variable is proposed. On the other hand, we propose a penalized version of the FLM by introducing a P‐spline penalty in the least squares fitting criterion. But our aim is to propose P‐splines as a powerful tool simultaneously for variable selection and functional parameters estimation. In that sense, the importance of smoothing the response variable before fitting the model is also studied. In summary, penalized (L1 and L2) and nonpenalized regression are combined with a presmoothing of the response variable sample curves, based on regression splines or P‐splines, providing a total of six approaches to be compared in two simulation schemes. Finally, the most competitive approach is applied to a real data set based on the graft‐versus‐host disease, which is one of the most frequent complications (30%‐50%) in allogeneic hematopoietic stem‐cell transplantation.

keywords

  • function-on-scalar regression; lasso; p-splines; variable selection