Universal dependencies for Persian

dc.contributor.authorSeraji, Mojgan
dc.contributor.authorGinter, Filip
dc.contributor.authorNivre, Joakim
dc.contributor.organizationLanguage and Speech Technology
dc.contributor.organization-code1.2.246.10.2458963.20.47465613983
dc.converis.publication-id29509325
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/29509325
dc.date.accessioned2022-10-28T13:09:59Z
dc.date.available2022-10-28T13:09:59Z
dc.description.abstractThe Persian Universal Dependency Treebank (Persian UD) is a recent effort to annotate Persian with Universal Dependencies (UD), an ongoing project that designs unified, cross-linguistically valid grammatical representations, including part-of-speech tags, morphological features, and dependency relations. The Persian UD is a conversion of the Uppsala Persian Dependency Treebank (UPDT) to the Universal Dependencies framework and consists of nearly 6,000 sentences and 152,871 word tokens, with an average sentence length of 25 words. Besides following different syntactic annotation guidelines, the two treebanks differ in tokenization. All words containing unsegmented clitics (pronominal and copula clitics), which are annotated with complex labels in the UPDT, have been separated from their clitics and appear with distinct labels in the Persian UD. The UPDT's original syntactic annotation scheme is based on Stanford Typed Dependencies. In this paper, we present the approaches taken in the development of the Persian UD.
dc.format.pagerange2361-2365
dc.identifier.isbn978-2-9517408-9-1
dc.identifier.olddbid180174
dc.identifier.oldhandle10024/163268
dc.identifier.urihttps://www.utupub.fi/handle/11111/38113
dc.identifier.urlhttp://www.lrec-conf.org/proceedings/lrec2016/index.html
dc.identifier.urnURN:NBN:fi-fe2021042718705
dc.language.isoen
dc.okm.affiliatedauthorGinter, Filip
dc.okm.discipline113 Computer and information sciences
dc.okm.discipline6121 Languages
dc.okm.internationalcopublicationinternational co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA4 Conference Article
dc.publisher.countryFrance
dc.publisher.country-codeFR
dc.relation.conferenceInternational Conference on Language Resources and Evaluation (LREC)
dc.source.identifierhttps://www.utupub.fi/handle/10024/163268
dc.titleUniversal dependencies for Persian
dc.title.bookProceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016)
dc.year.issued2016

Files

Name:
697_Paper.pdf
Size:
125.85 KB
Format:
Adobe Portable Document Format
Description:
Publisher's PDF