On Visualizing Mixed-Type Data: A Joint Metric Approach to Profile Construction and Outlier Detection

publication date

  • March 2018

start page

  • 207

end page

  • 239

issue

  • 2

volume

  • 47

International Standard Serial Number (ISSN)

  • 0049-1241

Electronic International Standard Serial Number (EISSN)

  • 1552-8294

abstract

  • Survey data are usually of mixed type (quantitative, multistate categorical, and/or binary variables). Multidimensional scaling (MDS) is one of the most widely used methodologies for visualizing the profile structure of such data. MDS methods have been introduced in the literature since the 1960s, initially in publications in the psychometrics area. Nevertheless, the sensitivity and robustness of MDS configurations have scarcely been addressed in the specialized literature. In this work, we are interested in the construction of robust profiles for mixed-type data using a proper MDS configuration. To this end, we propose to compare different MDS configurations (coming from different metrics) through a combination of sensitivity and robustness analysis. In particular, as an alternative to the classical Gower metric, we propose a robust joint metric that combines different distance matrices, avoiding redundant information, via related metric scaling. The search for robustness and the identification of outliers are carried out through a distance-based procedure related to geometric variability notions. In this sense, we propose a statistic for detecting multivariate outliers in the context of mixed-type data and evaluate its performance through a simulation study. Finally, we apply these techniques to a real data set provided by the largest humanitarian organization involved in social programs in Spain, where we are able to identify, in a robust way, the most relevant factors defining the profiles of people who were at risk of social exclusion at the beginning of the 2008 economic crisis.
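
As a rough illustration of the baseline the abstract refers to, the following Python sketch computes a Gower-type distance for mixed data and feeds it to classical (metric) MDS. It is not the authors' joint metric or their outlier statistic. The function names `gower_distance` and `classical_mds`, the toy data, and the choice to treat binary variables like categorical ones (simple matching) are assumptions made here for illustration only.

```python
import numpy as np

def gower_distance(num, cat):
    """Gower-type distance matrix for mixed data (illustrative sketch).

    num: (n, p) array of quantitative variables.
    cat: (n, q) array of categorical labels (binary variables are
         treated as categorical here, which is a simplifying assumption).
    """
    num = np.asarray(num, dtype=float)
    cat = np.asarray(cat)
    # Quantitative part: similarity is 1 minus the range-normalised
    # absolute difference, computed per variable.
    ranges = num.max(axis=0) - num.min(axis=0)
    ranges[ranges == 0] = 1.0  # a constant column contributes no difference
    num_sim = 1.0 - np.abs(num[:, None, :] - num[None, :, :]) / ranges
    # Categorical part: similarity is 1 on a match and 0 otherwise.
    cat_sim = (cat[:, None, :] == cat[None, :, :]).astype(float)
    # Gower similarity: unweighted average over all variables.
    n_vars = num.shape[1] + cat.shape[1]
    sim = (num_sim.sum(axis=2) + cat_sim.sum(axis=2)) / n_vars
    # Distance d = sqrt(1 - s), one common convention for Gower's coefficient.
    return np.sqrt(np.clip(1.0 - sim, 0.0, None))

def classical_mds(dist, k=2):
    """Classical MDS: principal coordinates of a distance matrix."""
    n = dist.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    # Double-centred matrix of inner products.
    gram = -0.5 * centering @ (dist ** 2) @ centering
    vals, vecs = np.linalg.eigh(gram)
    top = np.argsort(vals)[::-1][:k]
    # Clip small negative eigenvalues caused by rounding error.
    return vecs[:, top] * np.sqrt(np.clip(vals[top], 0.0, None))

# Toy data (hypothetical): two quantitative and two categorical variables.
num = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 15.0]])
cat = np.array([["a", "x"], ["a", "y"], ["b", "x"], ["b", "y"]])

dist = gower_distance(num, cat)
coords = classical_mds(dist, k=2)
print(coords)
```

Each row of `coords` is a two-dimensional profile coordinate for one individual. Comparing configurations obtained from different metrics, as the abstract proposes, would amount to repeating the second step with a different distance matrix.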

keywords

  • gower distance; mds configurations; mixed-type data; outliers identification; related metric scaling; social vulnerability.