The evaluation of data sources using multivariate entropy tools

publication date

  • July 2017

start page

  • 145

end page

  • 157


volume

  • 78

International Standard Serial Number (ISSN)

  • 0957-4174

Electronic International Standard Serial Number (EISSN)

  • 1873-6793


abstract

  • We introduce from first principles an analysis of the information content of multivariate distributions as information sources. Specifically, we generalize a balance equation and a visualization device, the Entropy Triangle, for multivariate distributions and find notable differences with similar analyses done on joint distributions as models of information channels. As an example application, we extend a framework for the analysis of classifiers to also encompass the analysis of datasets. With such tools we analyze a handful of UCI machine learning tasks to start addressing the question of how well datasets convey the information they are supposed to capture about the phenomena they stand for.


  • Telecommunications


keywords

  • machine learning evaluation; dataset entropy; multivariate entropy; entropic measures; exploratory analysis; entropy ternary diagram; entropy balance equation