Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus

Bibliographic Details
Main Authors: Abadji, Julien (Author), Ortiz Suárez, Pedro Javier (Author), Romary, Laurent (Author), Sagot, Benoît (Author)
Other Authors: Lüngen, Harald (Editor), Kupietz, Marc (Editor), Bański, Piotr (Editor), Barbaresi, Adrien (Editor), Clematide, Simon (Editor), Pisetta, Ines (Editor)
Format: Conference Paper
Language: English
Published: Mannheim: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2021
DOI: 10.14618/ids-pub-10468
Online Access:
Resolving system: https://doi.org/10.14618/ids-pub-10468
Resolving system: https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104688
Long-term archiving, German National Library: https://d-nb.info/1237268664/34
Publisher, free of charge: https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10468
Author Notes: Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot; editors: Harald Lüngen, Marc Kupietz, Piotr Bański, Adrien Barbaresi, Simon Clematide, Ines Pisetta
Description
Item Description: In: Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-9) 2021. Limerick, 12 July 2021 (online event). Mannheim: Leibniz-Institut für Deutsche Sprache, 2021, pp. 1-9
Physical Description: Online Resource