On the Concept of Depth for Functional Data
Overview
published in
publication date
- June 2009
start page
- 718
end page
- 734
issue
- 486
volume
- 104
Digital Object Identifier (DOI)
International Standard Serial Number (ISSN)
- 0162-1459
Electronic International Standard Serial Number (EISSN)
- 1537-274X
abstract
- The statistical analysis of functional data is a growing need in many research areas. In particular, a robust methodology is important to study curves, which are the output of many experiments in applied statistics. As a starting point for this robust analysis, we propose, analyze, and apply a new definition of depth for functional observations based on the graphic representation of the curves. Given a collection of functions, it establishes the "centrality" of an observation and provides a natural center-outward ordering of the sample curves. Robust statistics, such as the median function or a trimmed mean function, can be defined from this depth definition. Its finite-dimensional version provides a new depth for multivariate data that is computationally feasible and useful for studying high-dimensional observations. Thus, this new depth is also suitable for complex observations such as microarray data, images, and those arising in some recent marketing and financial studies. Natural properties of these new concepts are established and the uniform consistency of the sample depth is proved. Simulation results show that, for some contaminated models, the corresponding depth-based trimmed mean performs better than other location estimators proposed in the literature. Data depth can also be used to screen for outliers. The ability of the new notions of depth to detect "shape" outliers is presented. Several real datasets are considered to illustrate this new concept of depth, including applications to microarray observations, weather data, and growth curves. Finally, through this depth, we generalize the Wilcoxon rank sum test to functions, which allows testing whether two groups of curves come from the same population. When applied to children's growth curves, this functional rank test shows different growth patterns for boys and girls.
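
To make the idea of a graph-based depth concrete, the following is a minimal sketch of one such depth, not necessarily the exact definition used in the article. It assumes that all curves are sampled on a common grid and that a curve's depth is the fraction of bands, each delimited by a pair of sample curves, that contain the whole curve. The function name `band_depth` and the toy sample are hypothetical and chosen only for illustration.

```python
from itertools import combinations

import numpy as np


def band_depth(curves):
    """Band-style depth using bands delimited by pairs of curves (assumed form).

    curves: array of shape (n_curves, n_points), sampled on a common grid.
    Returns one depth value in [0, 1] per curve.
    """
    curves = np.asarray(curves, dtype=float)
    n = curves.shape[0]
    pairs = list(combinations(range(n), 2))
    depth = np.zeros(n)
    for i, j in pairs:
        # Pointwise lower and upper envelope of the band spanned by curves i and j.
        lower = np.minimum(curves[i], curves[j])
        upper = np.maximum(curves[i], curves[j])
        # A curve counts only if it lies inside the band at every grid point.
        inside = np.all((curves >= lower) & (curves <= upper), axis=1)
        depth += inside
    return depth / len(pairs)


# Toy sample (illustrative only): five horizontal curves at levels 0 to 4.
grid = np.linspace(0.0, 1.0, 11)
sample = np.array([np.full_like(grid, level) for level in (0.0, 1.0, 2.0, 3.0, 4.0)])

depths = band_depth(sample)
# Center-outward ordering: deepest curve first.
order = np.argsort(-depths)
# The deepest curve plays the role of a median function.
median_index = int(np.argmax(depths))
```

Under these assumptions, a trimmed mean function could be obtained by averaging only the curves with the largest depth values, for example `sample[order[:3]].mean(axis=0)`. The loop over all pairs costs on the order of n² band checks, which is manageable for moderate sample sizes.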