Accuracy Measures for Binary Classification Based on a Quantitative Variable

Rui Santos; Miguel Felgueiras; João Paulo Martins; Liliana Ferreira

doi:10.57805/revstat.v17i2.266

Authors

Rui Santos Polytechnic Institute of Leiria
Miguel Felgueiras Polytechnic Institute of Leiria
João Paulo Martins Polytechnic Institute of Leiria
Liliana Ferreira Polytechnic Institute of Leiria

Keywords:

Binary classification, cut-point, ROC curve, sensitivity, specificity, simulation

Abstract

Choosing an appropriate methodology for binary classification based on an observed quantitative variable is usually difficult, so the use of appropriate accuracy measures is crucial. The ROC curve summarizes the accuracy of the applied methodology across all possible values of the cut-point, and the full and partial areas under the ROC curve are widely used for this purpose. The φ index, the value at which sensitivity equals specificity, may also be applied. In some applications, however, the accuracy at one specific cut-point may be sufficient. In that case, the optimal cut-point can be defined in different ways, such as maximizing the Youden index, maximizing the concordance probability, or minimizing the distance to the point of perfect classification (no misclassification). To compare the adequacy of these measures, a simulation study was performed under different scenarios. The results highlight the advantages and disadvantages of each procedure and recommend the use of the φ index.
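
The cut-point criteria named in the abstract can be illustrated with a minimal sketch. It is not the authors' simulation design: the normal distributions, sample sizes, and function names below are hypothetical, and the empirical formulas are assumptions based on the usual definitions (Youden index as sensitivity + specificity - 1, concordance probability as sensitivity × specificity, and φ approximated by the cut-point where the two are closest to equal).

```python
import math
import random


def sens_spec(neg, pos, c):
    """Empirical sensitivity and specificity when values above c are called positive."""
    se = sum(x > c for x in pos) / len(pos)
    sp = sum(x <= c for x in neg) / len(neg)
    return se, sp


def cutpoint_summary(neg, pos):
    """Return (cut-point, sensitivity, specificity) chosen by each criterion."""
    # Every observed value is a candidate cut-point.
    candidates = sorted(set(neg) | set(pos))
    rows = [(c, *sens_spec(neg, pos, c)) for c in candidates]
    return {
        # Youden index: maximize Se + Sp - 1.
        "youden": max(rows, key=lambda r: r[1] + r[2] - 1),
        # Concordance probability: maximize Se * Sp.
        "concordance": max(rows, key=lambda r: r[1] * r[2]),
        # Distance to the point of perfect classification (Se = Sp = 1).
        "closest_to_perfect": min(rows, key=lambda r: math.hypot(1 - r[1], 1 - r[2])),
        # phi: cut-point where Se and Sp are (approximately) equal.
        "phi": min(rows, key=lambda r: abs(r[1] - r[2])),
    }


# Hypothetical scenario: two unit-variance normal populations, 1.5 units apart.
rng = random.Random(0)
neg = [rng.gauss(0.0, 1.0) for _ in range(500)]
pos = [rng.gauss(1.5, 1.0) for _ in range(500)]

summary = cutpoint_summary(neg, pos)
for name, (c, se, sp) in summary.items():
    print(f"{name:>18}: cut-point={c:.3f}  Se={se:.3f}  Sp={sp:.3f}")
```

In a symmetric scenario such as this one, the four criteria tend to select similar cut-points; they can differ more when the two populations have unequal variances or skewed distributions.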