A new distance for data sets in a reproducing Kernel Hilbert Space Context

publication date

  • January 2014

start page

  • 222

end page

  • 229

volume

  • 8258

international standard serial number (ISSN)

  • 0302-9743

electronic international standard serial number (EISSN)

  • 1611-3349

abstract

  • In this paper we define distance functions for data sets in a reproducing kernel Hilbert space (RKHS) context. To this end we introduce kernels for data sets that provide a metrization of the power set. The proposed distances take into account the underlying generating probability distributions. In particular, we propose kernel distances that rely on the estimation of density level sets of the underlying data distributions, and that can be extended from data sets to probability measures. The performance of the proposed distances is tested on several simulated and real data sets.
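
As a rough illustration of what a distance between data sets in an RKHS can look like, the sketch below compares two samples through their empirical kernel mean embeddings (the biased maximum mean discrepancy estimate). This is a standard, generic construction and is not the level-set-based distance proposed in the paper, whose definition is not given in the abstract. The Gaussian kernel, the bandwidth `sigma`, the function names, and the simulated samples are all assumptions made for illustration.

```python
import numpy as np


def gaussian_gram(X, Y, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - y_j||^2 / (2 * sigma^2))."""
    sq = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def mean_embedding_distance(A, B, sigma=1.0):
    """RKHS norm of the difference between the empirical mean embeddings
    of data sets A and B (illustrative only, not the paper's distance)."""
    k_aa = gaussian_gram(A, A, sigma).mean()
    k_bb = gaussian_gram(B, B, sigma).mean()
    k_ab = gaussian_gram(A, B, sigma).mean()
    return float(np.sqrt(max(k_aa + k_bb - 2.0 * k_ab, 0.0)))


# Simulated data (assumed): A and B share a distribution, C is shifted.
rng = np.random.default_rng(0)
A = rng.normal(0.0, 1.0, size=(200, 2))
B = rng.normal(0.0, 1.0, size=(200, 2))
C = rng.normal(3.0, 1.0, size=(200, 2))

d_same = mean_embedding_distance(A, B)  # samples from the same distribution
d_diff = mean_embedding_distance(A, C)  # samples from different distributions
```

In this sketch the distance depends on the data sets only through kernel evaluations, so it reflects the generating distributions rather than the ordering or exact size of the samples; one would expect `d_diff` to be clearly larger than `d_same`.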