Validity, reliability, and significance: empirical methods for NLP and data science

Bibliographic Details
Main Authors: Riezler, Stefan (Author), Hagmann, Michael (Author)
Format: Book/Monograph
Language: English
Published: San Rafael, CA: Morgan & Claypool Publishers, [2022]
Series: Synthesis lectures on human language technologies #55
Author Notes: Stefan Riezler, Michael Hagmann
Description
Summary: Empirical methods are a means of answering methodological questions in the empirical sciences using statistical techniques. The methodological questions addressed in this book include the problems of validity, reliability, and significance. In the case of machine learning, these correspond to the questions of whether a model predicts what it purports to predict, whether a model's performance is consistent across replications, and whether a performance difference between two models is due to chance, respectively. The goal of this book is to answer these questions with concrete statistical tests that can be applied to assess the validity, reliability, and significance of data annotation and machine learning prediction in the fields of NLP and data science.
ISBN: 9781636392714
9781636392738