Quicksort leave-pair-out cross-validation for ROC curve analysis

dc.contributor.author: Numminen Riikka
dc.contributor.author: Montoya Perez Ileana
dc.contributor.author: Jambor Ivan
dc.contributor.author: Pahikkala Tapio
dc.contributor.author: Airola Antti
dc.contributor.organization: fi=data-analytiikka|en=Data-analytiikka|
dc.contributor.organization: fi=kuvantaminen ja kliininen diagnostiikka|en=Imaging and Clinical Diagnostics|
dc.contributor.organization: fi=terveysteknologia|en=Health Technology|
dc.contributor.organization: fi=tyks, vsshp|en=tyks, varha|
dc.contributor.organization-code: 1.2.246.10.2458963.20.28696315432
dc.contributor.organization-code: 1.2.246.10.2458963.20.68940835793
dc.contributor.organization-code: 1.2.246.10.2458963.20.69079168212
dc.converis.publication-id: 176696554
dc.converis.url: https://research.utu.fi/converis/portal/Publication/176696554
dc.date.accessioned: 2022-11-29T14:56:00Z
dc.date.available: 2022-11-29T14:56:00Z
dc.description.abstract: Receiver Operating Characteristic (ROC) curve analysis and area under the ROC curve (AUC) are commonly used performance measures in diagnostic systems. In this work, we assume a setting in which a classifier is inferred from multivariate data to predict the diagnostic outcome for new cases. Cross-validation is a resampling method for estimating the prediction performance of a classifier on data not used for inferring it. Tournament leave-pair-out cross-validation (TLPOCV) has been shown to be better than other resampling methods at producing a ranking of data that can be used for estimating ROC curves and the areas under them. However, the time complexity of TLPOCV, O(n^2), makes it impractical in many applications. In this article, a method called quicksort leave-pair-out cross-validation (QLPOCV) is presented in order to decrease the time complexity of obtaining a reliable ranking of data to O(n log n). The proposed method is compared with existing ones in an experimental study, demonstrating that, in terms of ROC curves and AUC values, QLPOCV produces performance estimates as accurate as those of TLPOCV, outperforming both k-fold and leave-one-out cross-validation.
dc.identifier.eissn: 1613-9658
dc.identifier.jour-issn: 0943-4062
dc.identifier.olddbid: 190027
dc.identifier.oldhandle: 10024/173118
dc.identifier.uri: https://www.utupub.fi/handle/11111/29790
dc.identifier.url: https://doi.org/10.1007/s00180-022-01288-3
dc.identifier.urn: URN:NBN:fi-fe2022110164030
dc.language.iso: en
dc.okm.affiliatedauthor: Numminen, Riikka
dc.okm.affiliatedauthor: Montoya Perez, Ileana
dc.okm.affiliatedauthor: Jambor, Ivan
dc.okm.affiliatedauthor: Pahikkala, Tapio
dc.okm.affiliatedauthor: Airola, Antti
dc.okm.affiliatedauthor: Dataimport, tyks, vsshp
dc.okm.discipline: 112 Statistics and probability (en_GB)
dc.okm.discipline: 113 Computer and information sciences (en_GB)
dc.okm.discipline: 112 Tilastotiede (fi_FI)
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet (fi_FI)
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: SPRINGER HEIDELBERG
dc.publisher.country: Germany (en_GB)
dc.publisher.country: Saksa (fi_FI)
dc.publisher.country-code: DE
dc.relation.doi: 10.1007/s00180-022-01288-3
dc.relation.ispartofjournal: Computational Statistics
dc.source.identifier: https://www.utupub.fi/handle/10024/173118
dc.title: Quicksort leave-pair-out cross-validation for ROC curve analysis
dc.year.issued: 2022
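
The idea summarized in the abstract, ranking the data with a quicksort whose comparisons are leave-pair-out predictions, can be illustrated with a minimal Python sketch. This is one plausible reading of the abstract, not the authors' implementation: the 1-D centroid classifier, the toy data, and all function names (`fit_centroids`, `ranks_higher`, `quicksort_rank`, `auc_from_ranking`) are hypothetical and chosen only for illustration.

```python
import random

def fit_centroids(train):
    # Hypothetical stand-in classifier: class means of a single feature.
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must remain in the training set")
    return sum(pos) / len(pos), sum(neg) / len(neg)

def score(model, x):
    # Higher score means "closer to the positive centroid".
    mu_pos, mu_neg = model
    return abs(x - mu_neg) - abs(x - mu_pos)

def ranks_higher(data, i, j):
    # Leave-pair-out comparison: train without examples i and j,
    # then check whether i receives a higher score than j.
    train = [d for k, d in enumerate(data) if k != i and k != j]
    model = fit_centroids(train)
    return score(model, data[i][0]) > score(model, data[j][0])

def quicksort_rank(data, idx, rng):
    # Quicksort over example indices with a random pivot; each comparison
    # costs one training round, so roughly n log n rounds are expected.
    if len(idx) <= 1:
        return list(idx)
    pivot = idx[rng.randrange(len(idx))]
    above, below = [], []
    for i in idx:
        if i == pivot:
            continue
        (above if ranks_higher(data, i, pivot) else below).append(i)
    return quicksort_rank(data, above, rng) + [pivot] + quicksort_rank(data, below, rng)

def auc_from_ranking(data, ranking):
    # AUC as the fraction of (positive, negative) pairs ordered correctly.
    position = {i: r for r, i in enumerate(ranking)}
    pos = [i for i, (_, y) in enumerate(data) if y == 1]
    neg = [i for i, (_, y) in enumerate(data) if y == 0]
    correct = sum(position[p] < position[n] for p in pos for n in neg)
    return correct / (len(pos) * len(neg))

# Toy data (assumed): (feature, label) pairs with well-separated classes.
data = [(0.0, 0), (0.1, 0), (0.2, 0), (0.3, 0), (0.4, 0),
        (2.0, 1), (2.1, 1), (2.2, 1), (2.3, 1), (2.4, 1)]
ranking = quicksort_rank(data, list(range(len(data))), random.Random(0))
auc = auc_from_ranking(data, ranking)
```

A tournament-style procedure would instead compare every pair of examples, which is where the O(n^2) cost mentioned in the abstract comes from; the sketch only compares each example with the pivots it meets during sorting.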

Files

Name: s00180-022-01288-3.pdf
Size: 1.38 MB
Format: Adobe Portable Document Format