Validity, reliability, and significance: empirical methods for NLP and data science
| Main Authors: | Stefan Riezler, Michael Hagmann |
|---|---|
| Format: | Book/Monograph |
| Language: | English |
| Published: | San Rafael, CA: Morgan & Claypool Publishers, [2022] |
| Series: | Synthesis lectures on human language technologies #55 |
| In: | Synthesis lectures on human language technologies (#55) |
| Author Notes: | Stefan Riezler, Michael Hagmann |
| Summary: | Empirical methods are means to answering methodological questions of empirical sciences by statistical techniques. The methodological questions addressed in this book include the problems of validity, reliability, and significance. In the case of machine learning, these correspond to the questions of whether a model predicts what it purports to predict, whether a model's performance is consistent across replications, and whether a performance difference between two models is due to chance, respectively. The goal of this book is to answer these questions by concrete statistical tests that can be applied to assess validity, reliability, and significance of data annotation and machine learning prediction in the fields of NLP and data science. |
| ISBN: | 9781636392714 9781636392738 |