On the Concept of Depth for Functional Data

authors

  • LOPEZ PINTADO, SARA
  • ROMO URROZ, JUAN

publication date

  • June 2009

start page

  • 718

end page

  • 734

issue

  • 486

volume

  • 104

International Standard Serial Number (ISSN)

  • 0162-1459

Electronic International Standard Serial Number (EISSN)

  • 1537-274X

abstract

  • The statistical analysis of functional data is a growing need in many research areas. In particular, a robust methodology is important to study curves, which are the output of many experiments in applied statistics. As a starting point for this robust analysis, we propose, analyze, and apply a new definition of depth for functional observations based on the graphic representation of the curves. Given a collection of functions, it establishes the "centrality" of an observation and provides a natural center-outward ordering of the sample curves. Robust statistics, such as the median function or a trimmed mean function, can be defined from this depth definition. Its finite-dimensional version provides a new depth for multivariate data that is computationally feasible and useful for studying high-dimensional observations. Thus, this new depth is also suitable for complex observations such as microarray data, images, and those arising in some recent marketing and financial studies. Natural properties of these new concepts are established and the uniform consistency of the sample depth is proved. Simulation results show that, for some contaminated models, the corresponding depth-based trimmed mean performs better than other location estimators proposed in the literature. Data depth can also be used to screen for outliers. The ability of the new notions of depth to detect "shape" outliers is presented. Several real datasets are considered to illustrate this new concept of depth, including applications to microarray observations, weather data, and growth curves. Finally, through this depth, we generalize the Wilcoxon rank sum test to functions. It allows testing whether two groups of curves come from the same population. This functional rank test, when applied to children's growth curves, shows different growth patterns for boys and girls.
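
The abstract describes a depth built from the graphic representation of the curves and the median and trimmed mean derived from it, but it gives no formula. The following is a minimal sketch under stated assumptions, not the authors' implementation: it assumes curves sampled on a common grid, assumes bands delimited by pairs of sample curves, and scores each curve by the fraction of such bands that contain it at every grid point. The function names `band_depth` and `trimmed_mean` and the parameter `alpha` are hypothetical, chosen for illustration only.

```python
from itertools import combinations

import numpy as np


def band_depth(curves):
    """Fraction of pairwise bands that fully contain each curve.

    curves: array of shape (n_curves, n_points), all on the same grid.
    Assumption: bands are built from pairs of curves only.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    counts = np.zeros(n)
    for i, j in combinations(range(n), 2):
        # Pointwise envelope of the band delimited by curves i and j.
        lower = np.minimum(curves[i], curves[j])
        upper = np.maximum(curves[i], curves[j])
        # A curve counts only if it stays inside the band everywhere.
        counts += np.all((curves >= lower) & (curves <= upper), axis=1)
    return counts / (n * (n - 1) / 2)


def trimmed_mean(curves, alpha=0.2):
    """Pointwise mean of the curves left after dropping the
    floor(alpha * n) least deep ones (illustrative definition)."""
    curves = np.asarray(curves, dtype=float)
    depth = band_depth(curves)
    keep = len(curves) - int(np.floor(alpha * len(curves)))
    deepest = np.argsort(-depth, kind="stable")[:keep]
    return curves[deepest].mean(axis=0)


# Toy data: five flat curves at levels 0..4 on a three-point grid.
curves = np.array([[level] * 3 for level in range(5)], dtype=float)
depth = band_depth(curves)
median_index = int(np.argmax(depth))  # deepest curve acts as the median
```

In this toy sample the middle curve is the deepest and the two extreme curves are the shallowest, which is the center-outward ordering the abstract refers to. The pairwise loop costs on the order of n² band checks, so this sketch is meant for small samples.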