Predicting pregnancy outcomes using longitudinal information: a penalized splines mixed-effects model approach

authors

  • DE LA CRUZ, ROLANDO
  • FUENTES, CLAUDIO
  • MEZA, CRISTIAN
  • LEE, DAE JIN
  • ARRIBAS GIL, ANA

publication date

  • June 2017

start page

  • 2120

end page

  • 2134

issue

  • 13

volume

  • 36

International Standard Serial Number (ISSN)

  • 0277-6715

Electronic International Standard Serial Number (EISSN)

  • 1097-0258

abstract

  • We propose a semiparametric nonlinear mixed-effects model (SNMM) using penalized splines to classify longitudinal data and improve the prediction of a binary outcome. The work is motivated by a study in which different hormone levels were measured during the early stages of pregnancy, and the challenge is using this information to predict normal versus abnormal pregnancy outcomes. The aim of this paper is to compare models and estimation strategies on the basis of alternative formulations of SNMMs depending on the characteristics of the data set under consideration. For our motivating example, we address the classification problem using a particular case of the SNMM in which the parameter space has a finite dimensional component (fixed effects and variance components) and an infinite dimensional component (unknown function) that need to be estimated. The nonparametric component of the model is estimated using penalized splines. For the parametric component, we compare the advantages of using random effects versus direct modeling of the correlation structure of the errors. Numerical studies show that our approach improves over other existing methods for the analysis of this type of data. Furthermore, the results obtained using our method support the idea that explicit modeling of the serial correlation of the error term improves the prediction accuracy with respect to a model with random effects but independent errors. Copyright © 2017 John Wiley & Sons, Ltd.

keywords

  • classification models; correlated observations; longitudinal data; mixed-effects models; p-splines; lasso-type estimators; Bayesian classification; regression-analysis; correlated errors; hCG