Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus
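| Main authors: | Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot |
|---|---|
| Other contributors: | Harald Lüngen, Marc Kupietz, Piotr Bański, Adrien Barbaresi, Simon Clematide, Ines Pisetta (editors) |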
| Document type: | Conference paper |
| Language: | English |
| Published: | Mannheim: Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek, 2021 |
| DOI: | 10.14618/ids-pub-10468 |
| Online access: | Resolving system: https://doi.org/10.14618/ids-pub-10468 ; Resolving system: https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104688 ; Long-term archive, German National Library: https://d-nb.info/1237268664/34 ; Publisher, free of charge: https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10468 |
| Statement of responsibility: | Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot ; editors: Harald Lüngen, Marc Kupietz, Piotr Bański, Adrien Barbaresi, Simon Clematide, Ines Pisetta |
MARC
| LEADER | 00000cam a2200000 c 4500 | ||
|---|---|---|---|
| 001 | 1788941799 | ||
| 003 | DE-627 | ||
| 005 | 20250106002954.0 | ||
| 007 | cr uuu---uuuuu | ||
| 008 | 220208s2021 gw |||||o 00| ||eng c | ||
| 015 | |a 21,O08 |2 dnb | ||
| 016 | 7 | |a 1237268664 |2 DE-101 | |
| 024 | 7 | |a urn:nbn:de:bsz:mh39-104688 |2 urn | |
| 024 | 7 | |a 10.14618/ids-pub-10468 |2 doi | |
| 035 | |a (DE-627)1788941799 | ||
| 035 | |a (DE-599)DNB1237268664 | ||
| 035 | |a (OCoLC)1409804370 | ||
| 040 | |a DE-627 |b ger |c DE-627 |e rakwb | ||
| 041 | |a eng | ||
| 044 | |c XA-DE-BW | ||
| 082 | 0 | |a 410.1 |q DE-101 | |
| 082 | 0 | 4 | |a 400 |q DE-101 |
| 084 | |a 28 |2 sdnb | ||
| 084 | |a 51 |2 sdnb | ||
| 100 | 1 | |a Abadji, Julien |e VerfasserIn |4 aut | |
| 245 | 1 | 0 | |a Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus |c Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot ; Herausgeber: Harald Lüngen, Marc Kupietz, Piotr Bański, Adrien Barbaresi, Simon Clematide, Ines Pisetta |
| 264 | 1 | |a Mannheim |b Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek |c 2021 | |
| 300 | |a 1 Online-Ressource | ||
| 336 | |a Text |b txt |2 rdacontent | ||
| 337 | |a Computermedien |b c |2 rdamedia | ||
| 338 | |a Online-Ressource |b cr |2 rdacarrier | ||
| 500 | |a In: Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-9) 2021. Limerick, 12 July 2021 (Online-Event). - Mannheim : Leibniz-Institut für Deutsche Sprache, 2021, S. 1-9 | ||
| 700 | 1 | |a Ortiz Suárez, Pedro Javier |e VerfasserIn |4 aut | |
| 700 | 1 | |a Romary, Laurent |d 1964- |e VerfasserIn |0 (DE-588)1052510752 |0 (DE-627)788577808 |0 (DE-576)408216131 |4 aut | |
| 700 | 1 | |a Sagot, Benoît |e VerfasserIn |4 aut | |
| 700 | 1 | |a Lüngen, Harald |e HerausgeberIn |4 edt | |
| 700 | 1 | |a Kupietz, Marc |d 1971- |e HerausgeberIn |0 (DE-588)1023035693 |0 (DE-627)717351920 |0 (DE-576)308136020 |4 edt | |
| 700 | 1 | |8 1\p |a Bański, Piotr |e HerausgeberIn |0 (DE-588)1141205777 |0 (DE-627)898963494 |0 (DE-576)49426442X |4 edt | |
| 700 | 1 | |a Barbaresi, Adrien |e HerausgeberIn |0 (DE-588)1141699702 |0 (DE-627)100069058X |0 (DE-576)494469757 |4 edt | |
| 700 | 1 | |a Clematide, Simon |d 1968- |e HerausgeberIn |0 (DE-588)1121609511 |0 (DE-627)874379822 |0 (DE-576)480897816 |4 edt | |
| 700 | 1 | |a Pisetta, Ines |e HerausgeberIn |4 edt | |
| 856 | 4 | 0 | |u https://doi.org/10.14618/ids-pub-10468 |v 2022-02-07 |x Resolving-System |
| 856 | 4 | 0 | |u https://nbn-resolving.org/urn:nbn:de:bsz:mh39-104688 |v 2022-02-07 |x Resolving-System |
| 856 | 4 | 0 | |u https://d-nb.info/1237268664/34 |v 2022-02-07 |x Langzeitarchivierung Nationalbibliothek |
| 856 | 4 | 0 | |u https://ids-pub.bsz-bw.de/frontdoor/index/index/docId/10468 |q application/pdf |v 2022-02-07 |x Verlag |z kostenfrei |
| 883 | |8 1\p |a cgwrk |d 20241001 |q DE-101 |u https://d-nb.info/provenance/plan#cgwrk | ||
| 951 | |a BO | ||
| 992 | |a 20231120 | ||
| 993 | |a ConferencePaper | ||
| 994 | |a 2021 | ||
| 998 | |g 1023035693 |a Kupietz, Marc |m 1023035693:Kupietz, Marc |d 90000 |d 90500 |e 90000PK1023035693 |e 90500PK1023035693 |k 0/90000/ |k 1/90000/90500/ |p 2 | ||
| 999 | |a KXP-PPN1788941799 |e 4414083761 | ||
| BIB | |a Y | ||
| JSO | |a {"recId":"1788941799","name":{"displayForm":["Julien Abadji, Pedro Javier Ortiz Suárez, Laurent Romary, Benoît Sagot ; Herausgeber: Harald Lüngen, Marc Kupietz, Piotr Bański, Adrien Barbaresi, Simon Clematide, Ines Pisetta"]},"origin":[{"publisherPlace":"Mannheim","dateIssuedKey":"2021","dateIssuedDisp":"2021","publisher":"Leibniz-Institut für Deutsche Sprache (IDS), Bibliothek"}],"person":[{"display":"Abadji, Julien","given":"Julien","family":"Abadji","role":"aut"},{"role":"aut","given":"Pedro Javier","display":"Ortiz Suárez, Pedro Javier","family":"Ortiz Suárez"},{"role":"aut","display":"Romary, Laurent","given":"Laurent","family":"Romary"},{"role":"aut","given":"Benoît","display":"Sagot, Benoît","family":"Sagot"},{"display":"Lüngen, Harald","given":"Harald","family":"Lüngen","role":"edt"},{"role":"edt","family":"Kupietz","given":"Marc","display":"Kupietz, Marc"},{"role":"edt","given":"Piotr","display":"Bański, Piotr","family":"Bański"},{"role":"edt","display":"Barbaresi, Adrien","given":"Adrien","family":"Barbaresi"},{"family":"Clematide","display":"Clematide, Simon","given":"Simon","role":"edt"},{"role":"edt","family":"Pisetta","given":"Ines","display":"Pisetta, Ines"}],"language":["eng"],"title":[{"title_sort":"Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus","title":"Ungoliant: An optimized pipeline for the generation of a very large-scale multilingual web corpus"}],"type":{"media":"Online-Ressource","bibl":"book"},"id":{"eki":["1788941799"],"doi":["10.14618/ids-pub-10468"],"uri":["urn:nbn:de:bsz:mh39-104688"]},"physDesc":[{"extent":"1 Online-Ressource"}],"note":["In: Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-9) 2021. Limerick, 12 July 2021 (Online-Event). - Mannheim : Leibniz-Institut für Deutsche Sprache, 2021, S. 1-9"]} | ||
| SRT | |a ABADJIJULIUNGOLIANTA2021 | ||