Electronic International Standard Serial Number (EISSN)
1558-2361
abstract
In this letter, we present a new class of segment-based features for speech, music and song discrimination. These features, called PHEQ (Polynomial-Fit Histogram Equalization), are derived from the nonlinear relationship between the short-term feature distributions computed at segment level and a reference distribution. Results show that PHEQ characteristics outperform short-term features such as Mel Frequency Cepstrum Coefficients (MFCC) and conventional segment-based ones such as MFCC mean and variance. Furthermore, the combination of short-term and PHEQ features significantly improves the performance of the whole system.