Imbalance example-dependent cost classification: A Bayesian based method

publication date

  • March 2023

start page

  • 1

end page

  • 13

issue

  • Part B

volume

  • 213

International Standard Serial Number (ISSN)

  • 0957-4174

Electronic International Standard Serial Number (EISSN)

  • 1873-6793

abstract

  • Example-dependent cost classification is a special case of pattern classification in which the costs are specific to each individual pattern. Most practical applications of this kind of classification problem exhibit class imbalance in the available data, which adds a further difficulty to the classification task. This problem has high practical importance because it appears intrinsically in relevant application fields, such as finance or health. We propose a 2-step Bayesian methodology to solve this problem because its formulation allows the individual example costs to be included in the classification and takes the class probabilities into account. In particular, the main contribution is to apply principled rebalancing classification algorithms in the first step: we propose three neural-network-based learning machines, WR-MLP, WSR-MLPE and WSR-DNN, to provide the estimates of the conditional probabilities required for the Bayesian test. Unlike some similar approaches in the literature, which use heuristic methods in the first step and in most cases require calibration mechanisms to compensate for estimation biases, the proposed approach yields estimates whose consistency is theoretically supported, a clear potential advantage. Experiments with seven real-world datasets show that the proposed methods are competitive against eleven state-of-the-art benchmarks and provide an advantage in the less favourable situations: cases with strong imbalance and highly nonlinear classification borders.
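
The second step described in the abstract, a Bayesian test that combines an estimated conditional probability with the costs of each individual example, can be illustrated with a minimal sketch. This is not the authors' implementation: the names `ExampleCosts`, `expected_costs` and `bayes_decision` are hypothetical, the posterior `p_pos` is assumed to come from some first-step estimator (for instance, one of the proposed networks), and the rule shown is the standard minimum-expected-cost decision for two classes.

```python
from dataclasses import dataclass


@dataclass
class ExampleCosts:
    """Costs attached to ONE example (hypothetical container, for illustration)."""
    fp: float        # cost of predicting positive when the true class is negative
    fn: float        # cost of predicting negative when the true class is positive
    tp: float = 0.0  # cost of a correct positive prediction
    tn: float = 0.0  # cost of a correct negative prediction


def expected_costs(p_pos: float, costs: ExampleCosts) -> tuple[float, float]:
    """Return (expected cost of deciding 1, expected cost of deciding 0).

    p_pos is the estimated probability that this example is positive,
    assumed to be supplied by the first-step probability estimator.
    """
    cost_if_pos = p_pos * costs.tp + (1.0 - p_pos) * costs.fp
    cost_if_neg = p_pos * costs.fn + (1.0 - p_pos) * costs.tn
    return cost_if_pos, cost_if_neg


def bayes_decision(p_pos: float, costs: ExampleCosts) -> int:
    """Choose the class with the lower expected cost for this example."""
    cost_if_pos, cost_if_neg = expected_costs(p_pos, costs)
    return 1 if cost_if_pos < cost_if_neg else 0


# Two examples with the same estimated posterior but different individual costs.
expensive_miss = ExampleCosts(fp=1.0, fn=20.0)
cheap_miss = ExampleCosts(fp=1.0, fn=5.0)

decision_a = bayes_decision(0.1, expensive_miss)
decision_b = bayes_decision(0.1, cheap_miss)
print(decision_a, decision_b)
```

In this sketch both examples share a posterior of 0.1, yet the decisions can differ because the threshold is effectively set per example by its own costs. That is why the quality of the probability estimates from the first step matters: a biased estimate shifts every per-example decision.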

subjects

  • Telecommunications