Evaluating probabilistic classifiers: the triptych

Probability forecasts for binary outcomes, often referred to as probabilistic classifiers or confidence scores, are ubiquitous in science and society, and methods for evaluating and comparing them are in great demand. We propose and study a triptych of diagnostic graphics that focus on distinct and...

Bibliographic details
Main authors: Dimitriadis, Timo (author); Gneiting, Tilmann (author); Jordan, Alexander I. (author); Vogel, Peter (author)
Document type: Article (Journal), Chapter/Article
Language: English
Published: January 27, 2023
In: Arxiv
Year: 2023, Pages: 1-32
DOI:10.48550/arXiv.2301.10803
Online access: Publisher, free of charge, full text: https://doi.org/10.48550/arXiv.2301.10803
Publisher, free of charge, full text: http://arxiv.org/abs/2301.10803
Statement of responsibility: Timo Dimitriadis, Tilmann Gneiting, Alexander I. Jordan, Peter Vogel

MARC

LEADER 00000caa a2200000 c 4500
001 1860283985
003 DE-627
005 20240307052959.0
007 cr uuu---uuuuu
008 230926s2023 xx |||||o 00| ||eng c
024 7 |a 10.48550/arXiv.2301.10803  |2 doi 
035 |a (DE-627)1860283985 
035 |a (DE-599)KXP1860283985 
035 |a (OCoLC)1425211034 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 17  |2 sdnb 
100 1 |a Dimitriadis, Timo  |e VerfasserIn  |0 (DE-588)1230883045  |0 (DE-627)1753224217  |4 aut 
245 1 0 |a Evaluating probabilistic classifiers  |b the triptych  |c Timo Dimitriadis, Tilmann Gneiting, Alexander I. Jordan, Peter Vogel 
264 1 |c January 27, 2023 
300 |a 32 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 26.09.2023 
520 |a Probability forecasts for binary outcomes, often referred to as probabilistic classifiers or confidence scores, are ubiquitous in science and society, and methods for evaluating and comparing them are in great demand. We propose and study a triptych of diagnostic graphics that focus on distinct and complementary aspects of forecast performance: The reliability diagram addresses calibration, the receiver operating characteristic (ROC) curve diagnoses discrimination ability, and the Murphy diagram visualizes overall predictive performance and value. A Murphy curve shows a forecast's mean elementary scores, including the widely used misclassification rate, and the area under a Murphy curve equals the mean Brier score. For a calibrated forecast, the reliability curve lies on the diagonal, and for competing calibrated forecasts, the ROC and Murphy curves share the same number of crossing points. We invoke the recently developed CORP (Consistent, Optimally binned, Reproducible, and Pool-Adjacent-Violators (PAV) algorithm based) approach to craft reliability diagrams and decompose a mean score into miscalibration (MCB), discrimination (DSC), and uncertainty (UNC) components. Plots of the DSC measure of discrimination ability versus the calibration metric MCB visualize classifier performance across multiple competitors. The proposed tools are illustrated in empirical examples from astrophysics, economics, and social science. 
650 4 |a Computer Science - Machine Learning 
650 4 |a Statistics - Machine Learning 
650 4 |a Statistics - Methodology 
700 1 |a Gneiting, Tilmann  |e VerfasserIn  |0 (DE-588)1019627484  |0 (DE-627)690974809  |0 (DE-576)358470323  |4 aut 
700 1 |a Jordan, Alexander I.  |e VerfasserIn  |0 (DE-588)1027203264  |0 (DE-627)72857747X  |0 (DE-576)372589928  |4 aut 
700 1 |a Vogel, Peter  |d 198X-  |e VerfasserIn  |0 (DE-588)1179421345  |0 (DE-627)1067262806  |0 (DE-576)518237478  |4 aut 
773 0 8 |i Enthalten in  |t Arxiv  |d Ithaca, NY : Cornell University, 1991  |g (2023) vom: Jan., Artikel-ID 2301.10803, Seite 1-32  |h Online-Ressource  |w (DE-627)509006531  |w (DE-600)2225896-6  |w (DE-576)28130436X  |7 nnas  |a Evaluating probabilistic classifiers the triptych 
773 1 8 |g year:2023  |g month:01  |g elocationid:2301.10803  |g pages:1-32  |g extent:32  |a Evaluating probabilistic classifiers the triptych 
787 0 8 |i Forschungsdaten  |a Dimitriadis, Timo  |t Replication package for "Evaluating probabilistic classifiers: the triptych"  |d [San Francisco] : GitHub, Inc., 2023  |h 1 Online-Ressource  |w (DE-627)1860279503 
856 4 0 |u https://doi.org/10.48550/arXiv.2301.10803  |x Verlag  |x Resolving-System  |z kostenfrei  |3 Volltext 
856 4 0 |u http://arxiv.org/abs/2301.10803  |x Verlag  |z kostenfrei  |3 Volltext 
951 |a AR 
992 |a 20230926 
993 |a Article 
994 |a 2023 
998 |g 1230883045  |a Dimitriadis, Timo  |m 1230883045:Dimitriadis, Timo  |d 180000  |d 181000  |e 180000PD1230883045  |e 181000PD1230883045  |k 0/180000/  |k 1/180000/181000/  |p 1  |x j 
999 |a KXP-PPN1860283985  |e 4379011372 
BIB |a Y 
JSO |a {"note":["Gesehen am 26.09.2023"],"type":{"media":"Online-Ressource","bibl":"chapter"},"recId":"1860283985","language":["eng"],"person":[{"given":"Timo","family":"Dimitriadis","role":"aut","display":"Dimitriadis, Timo","roleDisplay":"VerfasserIn"},{"given":"Tilmann","family":"Gneiting","role":"aut","roleDisplay":"VerfasserIn","display":"Gneiting, Tilmann"},{"roleDisplay":"VerfasserIn","display":"Jordan, Alexander I.","role":"aut","family":"Jordan","given":"Alexander I."},{"roleDisplay":"VerfasserIn","display":"Vogel, Peter","role":"aut","family":"Vogel","given":"Peter"}],"title":[{"title_sort":"Evaluating probabilistic classifiers","title":"Evaluating probabilistic classifiers","subtitle":"the triptych"}],"physDesc":[{"extent":"32 S."}],"relHost":[{"title":[{"title_sort":"Arxiv","title":"Arxiv"}],"part":{"pages":"1-32","year":"2023","extent":"32","text":"(2023) vom: Jan., Artikel-ID 2301.10803, Seite 1-32"},"titleAlt":[{"title":"Arxiv.org"},{"title":"Arxiv.org e-print archive"},{"title":"Arxiv e-print archive"},{"title":"De.arxiv.org"}],"pubHistory":["1991 -"],"recId":"509006531","language":["eng"],"note":["Gesehen am 28.05.2024"],"disp":"Evaluating probabilistic classifiers the triptychArxiv","type":{"bibl":"edited-book","media":"Online-Ressource"},"id":{"zdb":["2225896-6"],"eki":["509006531"]},"origin":[{"publisher":"Cornell University ; Arxiv.org","dateIssuedKey":"1991","dateIssuedDisp":"1991-","publisherPlace":"Ithaca, NY ; [Erscheinungsort nicht ermittelbar]"}],"physDesc":[{"extent":"Online-Ressource"}]}],"name":{"displayForm":["Timo Dimitriadis, Tilmann Gneiting, Alexander I. Jordan, Peter Vogel"]},"origin":[{"dateIssuedKey":"2023","dateIssuedDisp":"January 27, 2023"}],"id":{"doi":["10.48550/arXiv.2301.10803"],"eki":["1860283985"]}} 
SRT |a DIMITRIADIEVALUATING2720