On the Asymptotic Distribution of Cook's Distance in Logistic Regression Models

authors

  • MARTIN APAOLAZA, NIRIAN
  • PARDO, LEANDRO

publication date

  • October 2009

start page

  • 1119

end page

  • 1146

issue

  • 10

volume

  • 36

International Standard Serial Number (ISSN)

  • 0266-4763

Electronic International Standard Serial Number (EISSN)

  • 1360-0532

abstract

  • It sometimes happens that one or more components of the data exert a disproportionate influence on the model estimation. A reliable tool is needed to identify such troublesome cases, so that the analyst can decide either to eliminate them from the sample, when the data were badly collected, or otherwise to take care in using the model, because the results could be affected by such components. Since Cook proposed a measure for detecting influential cases in the linear regression setting [Detection of influential observations in linear regression, Technometrics 19 (1977), pp. 15-18], several new measures have been suggested as single-case diagnostics, in addition to extensions of the same measure to other models. For most of them, cutoff values have been recommended (see, for instance, [D.A. Belsley, E. Kuh, and R.E. Welsch, Regression Diagnostics: Identifying Influential Data and Sources of Collinearity, 2nd ed., John Wiley & Sons, New York, Chichester, Brisbane, 2004]); however, the lack of a quantile-type cutoff for Cook's statistic has led analysts to rely only on index plots as diagnostic tools. Focusing on logistic regression, the aim of this paper is to provide the asymptotic distribution of Cook's distance in order to obtain a meaningful cutoff point for detecting influential and leverage observations.