clustering by compression; data compression; information distortion; Kolmogorov complexity; normalized compression distance; block-sorting; compression algorithms; data sets; experimental evaluation; knowledge discovery and data mining; text clustering; cluster analysis; data mining