Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan

In this paper we present the first-ever procedure for identifying highly similar sequences of text in Chinese and Tibetan translations of Buddhist sūtra literature. We initially propose this procedure as an aid to scholars engaged in the philological study of Buddhist documents. We create a cross-l...
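The full abstract (MARC field 520 in this record) describes comparing text sequences by the cosine similarity of their averaged vectors in a cross-lingual embedding space. A minimal sketch of that general idea follows. It is not the authors' implementation: the function names, the two-dimensional toy vectors, and the assumption that Chinese tokens and Wylie-transliterated Tibetan tokens already share one embedding space are all hypothetical and chosen only for illustration.

```python
import math

# Invented toy vectors, assumed to live in one shared cross-lingual space.
# Real embeddings would be learned from corpora and have far more dimensions.
TOY_EMBEDDINGS = {
    "佛": [1.0, 0.0],           # Chinese: Buddha
    "法": [0.0, 1.0],           # Chinese: dharma
    "sangs rgyas": [0.9, 0.1],  # Tibetan (Wylie): Buddha
    "chos": [0.1, 0.9],         # Tibetan (Wylie): dharma
}


def average_vector(tokens, embeddings):
    """Return the mean vector of the known tokens, or None if none are known."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return None
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def best_match(query_tokens, candidates, embeddings):
    """Return (index, score) of the candidate sequence most similar to the query.

    Returns (None, 0.0) when the query or every candidate lacks known tokens.
    """
    query_vec = average_vector(query_tokens, embeddings)
    best_index, best_score = None, 0.0
    if query_vec is None:
        return best_index, best_score
    for i, cand_tokens in enumerate(candidates):
        cand_vec = average_vector(cand_tokens, embeddings)
        if cand_vec is None:
            continue
        score = cosine_similarity(query_vec, cand_vec)
        if best_index is None or score > best_score:
            best_index, best_score = i, score
    return best_index, best_score


if __name__ == "__main__":
    index, score = best_match(["佛"], [["chos"], ["sangs rgyas"]], TOY_EMBEDDINGS)
    print(index, round(score, 3))
```

Averaging discards word order, so a sketch like this can only rank candidate sequences by overall lexical-semantic closeness. How sequences are segmented and how the shared space is built are left open here.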


Bibliographic details
Main authors: Felbur, Rafal (author), Meelen, Marieke (author), Vierthaler, Paul (author)
Document type: Article (Journal)
Language: English
Published: 04 October 2022
In: Journal of open humanities data
Year: 2022, Volume: 8, Pages: 1-14
ISSN: 2059-481X
DOI: 10.5334/johd.86
Online access: Publisher, free of charge, full text: https://doi.org/10.5334/johd.86
Publisher, free of charge, full text: https://openhumanitiesdata.metajnl.com/articles/10.5334/johd.86
Statement of responsibility: Rafal Felbur, Marieke Meelen, Paul Vierthaler

MARC

LEADER 00000naa a2200000 c 4500
001 1945175486
003 DE-627
005 20251209170930.0
007 cr uuu---uuuuu
008 251209s2022 xx |||||o 00| ||eng c
024 7 |a 10.5334/johd.86  |2 doi 
035 |a (DE-627)1945175486 
035 |a (DE-599)KXP1945175486 
040 |a DE-627  |b ger  |c DE-627  |e rda 
041 |a eng 
084 |a 13  |2 sdnb 
100 1 |a Felbur, Rafal  |e VerfasserIn  |0 (DE-588)1383853908  |0 (DE-627)1945175397  |4 aut 
245 1 0 |a Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan  |c Rafal Felbur, Marieke Meelen, Paul Vierthaler 
264 1 |c 04 October 2022 
300 |b Diagramme, Illustrationen 
300 |a 14 
336 |a Text  |b txt  |2 rdacontent 
337 |a Computermedien  |b c  |2 rdamedia 
338 |a Online-Ressource  |b cr  |2 rdacarrier 
500 |a Gesehen am 09.12.2025 
520 |a In this paper we present the first-ever procedure for identifying highly similar sequences of text in Chinese and Tibetan translations of Buddhist sūtra literature. We initially propose this procedure as an aid to scholars engaged in the philological study of Buddhist documents. We create a cross-lingual embedding space by taking the cosine similarity of average sequence vectors in order to produce unsupervised similar cross-linguistic parallel alignments at word, sentence, and even paragraph level. Initial results show that our method lays a solid foundation for the future development of a fully-fledged Information Retrieval tool for these (and potentially other) low-resource historical languages. 
700 1 |a Meelen, Marieke  |e VerfasserIn  |4 aut 
700 1 |a Vierthaler, Paul  |e VerfasserIn  |4 aut 
773 0 8 |i Enthalten in  |t Journal of open humanities data  |d London : Ubiquity Press, 2015  |g 8(2022), Seite 1-14  |h Online-Ressource  |w (DE-627)1668580926  |w (DE-600)2976820-2  |x 2059-481X  |7 nnas  |a Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan 
773 1 8 |g volume:8  |g year:2022  |g pages:1-14  |g extent:14  |a Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan 
856 4 0 |u https://doi.org/10.5334/johd.86  |x Verlag  |x Resolving-System  |z kostenfrei  |3 Volltext  |7 0 
856 4 0 |u https://openhumanitiesdata.metajnl.com/articles/10.5334/johd.86  |x Verlag  |z kostenfrei  |3 Volltext  |7 0 
951 |a AR 
992 |a 20251209 
993 |a Article 
994 |a 2022 
998 |g 1383853908  |a Felbur, Rafal  |m 1383853908:Felbur, Rafal  |d 700000  |d 728300  |e 700000PF1383853908  |e 728300PF1383853908  |k 0/700000/  |k 1/700000/728300/  |p 1  |x j  |y j 
999 |a KXP-PPN1945175486  |e 4824944163 
BIB |a Y 
SER |a journal 
JSO |a {"physDesc":[{"noteIll":"Diagramme, Illustrationen","extent":"14 S."}],"relHost":[{"title":[{"subtitle":"JOHD","title":"Journal of open humanities data","title_sort":"Journal of open humanities data"}],"titleAlt":[{"title":"JOHD"}],"part":{"text":"8(2022), Seite 1-14","volume":"8","extent":"14","year":"2022","pages":"1-14"},"pubHistory":["Volume 1 (2015)-"],"recId":"1668580926","language":["eng"],"disp":"Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical TibetanJournal of open humanities data","type":{"bibl":"periodical","media":"Online-Ressource"},"id":{"issn":["2059-481X"],"eki":["1668580926"],"zdb":["2976820-2"]},"origin":[{"publisher":"Ubiquity Press","dateIssuedDisp":"[2015]-","publisherPlace":"London"}],"physDesc":[{"extent":"Online-Ressource"}]}],"name":{"displayForm":["Rafal Felbur, Marieke Meelen, Paul Vierthaler"]},"origin":[{"dateIssuedKey":"2022","dateIssuedDisp":"04 October 2022"}],"id":{"eki":["1945175486"],"doi":["10.5334/johd.86"]},"note":["Gesehen am 09.12.2025"],"type":{"media":"Online-Ressource","bibl":"article-journal"},"language":["eng"],"recId":"1945175486","person":[{"display":"Felbur, Rafal","roleDisplay":"VerfasserIn","role":"aut","family":"Felbur","given":"Rafal"},{"role":"aut","roleDisplay":"VerfasserIn","display":"Meelen, Marieke","given":"Marieke","family":"Meelen"},{"given":"Paul","family":"Vierthaler","role":"aut","roleDisplay":"VerfasserIn","display":"Vierthaler, Paul"}],"title":[{"title":"Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan","title_sort":"Crosslinguistic semantic textual similarity of Buddhist Chinese and Classical Tibetan"}]} 
SRT |a FELBURRAFACROSSLINGU0420