Detecting annotation noise in automatically labelled data

We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and o...

Bibliographic details
Main authors: Rehbein, Ines (author), Ruppenhofer, Josef (author)
Document type: Chapter/article, conference proceedings
Language: English
Published: July 2017
In: The 55th Annual Meeting of the Association for Computational Linguistics - proceedings of the conference ; Vol. 1: Long papers
Year: 2017, Pages: 1160-1170
DOI: 10.18653/v1/P17-1107
Online access: Publisher, license required, full text: https://doi.org/10.18653/v1/P17-1107
Publisher, license required, full text: https://www.aclweb.org/anthology/P17-1107
Statement of responsibility: Ines Rehbein, Josef Ruppenhofer

MARC

LEADER 00000caa a2200000 c 4500
001 1693636816
003 DE-627
005 20220818023618.0
007 cr uuu---uuuuu
008 200331s2017 xx |||||o 00| ||eng c
024 7 |a 10.18653/v1/P17-1107  |2 doi 
035 |a (DE-627)1693636816 
035 |a (DE-599)KXP1693636816 
035 |a (OCoLC)1341312817 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 28  |2 sdnb 
100 1 |a Rehbein, Ines  |e VerfasserIn  |0 (DE-588)1207353833  |0 (DE-627)1693632373  |4 aut 
245 1 0 |a Detecting annotation noise in automatically labelled data  |c Ines Rehbein, Josef Ruppenhofer 
264 1 |c July 2017 
300 |a 11 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 31.03.2020 
520 |a We introduce a method for error detection in automatically annotated text, aimed at supporting the creation of high-quality language resources at affordable cost. Our method combines an unsupervised generative model with human supervision from active learning. We test our approach on in-domain and out-of-domain data in two languages, in AL simulations and in a real world setting. For all settings, the results show that our method is able to detect annotation errors with high precision and high recall. 
700 1 |a Ruppenhofer, Josef  |d 1971-  |e VerfasserIn  |0 (DE-588)132071037  |0 (DE-627)517466287  |0 (DE-576)298927993  |4 aut 
773 0 8 |i Enthalten in  |a Association for Computational Linguistics. Annual meeting (55. : 2017 : Vancouver, B.C.)  |t The 55th Annual Meeting of the Association for Computational Linguistics - proceedings of the conference ; Vol. 1: Long papers  |d Stroudsburg, PA : Association for Computational Linguistics (ACL), 2017  |g (2017), Seite 1160-1170  |h 1 Online-Ressource (lx, 2143 Seiten, 116,74 MB)  |w (DE-627)895622025  |w (DE-576)9895622023  |z 9781945626753  |7 nnam 
773 1 8 |g year:2017  |g pages:1160-1170  |g extent:11  |a Detecting annotation noise in automatically labelled data 
787 0 8 |i Forschungsdaten  |a Rehbein, Ines  |t MACE-AL  |d Heidelberg : Universität, 2020  |h 1 Online-Ressource (1 File)  |w (DE-627)1693633965 
856 4 0 |u https://doi.org/10.18653/v1/P17-1107  |x Verlag  |x Resolving-System  |z lizenzpflichtig  |3 Volltext 
856 4 0 |u https://www.aclweb.org/anthology/P17-1107  |x Verlag  |z lizenzpflichtig  |3 Volltext 
951 |a AR 
992 |a 20200331 
993 |a ConferencePaper 
994 |a 2017 
998 |g 1207353833  |a Rehbein, Ines  |m 1207353833:Rehbein, Ines  |d 90000  |d 90500  |e 90000PR1207353833  |e 90500PR1207353833  |k 0/90000/  |k 1/90000/90500/  |p 1  |x j 
999 |a KXP-PPN1693636816  |e 3616573475 
BIB |a Y 
JSO |a {"type":{"media":"Online-Ressource","bibl":"chapter"},"note":["Gesehen am 31.03.2020"],"language":["eng"],"recId":"1693636816","person":[{"given":"Ines","family":"Rehbein","role":"aut","display":"Rehbein, Ines","roleDisplay":"VerfasserIn"},{"given":"Josef","family":"Ruppenhofer","role":"aut","display":"Ruppenhofer, Josef","roleDisplay":"VerfasserIn"}],"title":[{"title_sort":"Detecting annotation noise in automatically labelled data","title":"Detecting annotation noise in automatically labelled data"}],"physDesc":[{"extent":"11 S."}],"relHost":[{"titleAlt":[{"title":"Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics"}],"part":{"pages":"1160-1170","year":"2017","extent":"11","text":"(2017), Seite 1160-1170"},"language":["eng"],"corporate":[{"role":"aut","display":"Association for Computational Linguistics (55., 2017, Vancouver, B.C.)","roleDisplay":"VerfasserIn"},{"display":"Association for Computational Linguistics","roleDisplay":"Herausgebendes Organ","role":"isb"}],"recId":"895622025","disp":"Association for Computational Linguistics. Annual meeting (55. : 2017 : Vancouver, B.C.)The 55th Annual Meeting of the Association for Computational Linguistics - proceedings of the conference ; Vol. 1: Long papers","type":{"media":"Online-Ressource","bibl":"book"},"note":["\"Editors: Regina Barzilay, Min-Yen Kan\" - Startseite der Ressource","Literaturangaben"],"title":[{"title":"Long papers"}],"person":[{"given":"Regina","family":"Barzilay","role":"edt","display":"Barzilay, Regina","roleDisplay":"HerausgeberIn"},{"family":"Kan","given":"Min-Yen","display":"Kan, Min-Yen","roleDisplay":"HerausgeberIn","role":"edt"}],"relMultPart":[{"ga":1,"id":{"eki":["895621339"]},"origin":[{"publisher":"Association for Computational Linguistics (ACL)","dateIssuedDisp":"[2017]","publisherPlace":"Stroudsburg, PA"}],"title":[{"title":"The 55th Annual Meeting of the Association for Computational Linguistics - proceedings of the conference","subtitle":"July 30-August 4, 2017, Vancouver, Canada : ACL 2017","title_sort":"55th Annual Meeting of the Association for Computational Linguistics - proceedings of the conference"}],"language":["eng"],"corporate":[{"display":"Association for Computational Linguistics","roleDisplay":"Herausgebendes Organ","role":"isb"}],"recId":"895621339","type":{"bibl":"edited-book","media":"Online-Ressource"},"part":{"number":["Vol. 1"],"number_sort":["1.2017"]}}],"physDesc":[{"noteIll":"Illustrationen","extent":"1 Online-Ressource (lx, 2143 Seiten, 116,74 MB)"}],"id":{"eki":["895622025"],"isbn":["9781945626753"]},"origin":[{"publisherPlace":"Stroudsburg, PA","dateIssuedDisp":"[2017]","publisher":"Association for Computational Linguistics (ACL)","dateIssuedKey":"2017"}]}],"name":{"displayForm":["Ines Rehbein, Josef Ruppenhofer"]},"origin":[{"dateIssuedDisp":"July 2017","dateIssuedKey":"2017"}],"id":{"doi":["10.18653/v1/P17-1107"],"eki":["1693636816"]}} 
SRT |a REHBEININEDETECTINGA2017