On variability in the identification and labelling of disfluencies — preliminary results from 23 annotations of the same data

dc.contributor.author: Trouvain, Jürgen
dc.contributor.author: Crible, Ludivine
dc.contributor.author: Belz, Malte
dc.contributor.author: Betz, Simon
dc.contributor.author: Beňuš, Štefan
dc.contributor.author: Baqué, Lorraine
dc.contributor.author: Cantarutti, Marina
dc.contributor.author: Di Napoli, Jessica
dc.contributor.author: Didirková, Ivana
dc.contributor.author: Machuca, Maria
dc.contributor.author: Mareková, Lucia
dc.contributor.author: Niculescu, Oana
dc.contributor.author: Peltonen, Pauliina
dc.contributor.author: Pistono, Aurelie
dc.contributor.author: Schettino, Loredana
dc.contributor.author: Silber-Varod, Vered
dc.contributor.author: Williams, Simon
dc.contributor.organization: fi=digitaalinen kielentutkimus, espanja, italia, kiina, ranska, saksa|en=Digital Language Studies, Chinese, French, German, Italian, Spanish|
dc.contributor.organization-code: 1.2.246.10.2458963.20.36764574459
dc.converis.publication-id: 499212588
dc.converis.url: https://research.utu.fi/converis/portal/Publication/499212588
dc.date.accessioned: 2026-01-21T12:19:53Z
dc.date.available: 2026-01-21T12:19:53Z
dc.description.abstract: This study provides a preliminary report on a large inter-annotator agreement experiment where 23 expert annotators from various research backgrounds identified and labelled disfluencies in the same speech sample. Each annotator was instructed to analyze the sample according to the framework (definitions, segmentation, labels, etc.) they typically use. The annotations were then processed and compared across three different dimensions: 1) the scope of the chosen typology and the definitions within, 2) the implementation of the typology in terms of annotation tiers and labels, and 3) the temporal alignment of the annotations. Preliminary findings reveal that there are substantial variations between annotators on various levels of annotation. The lack of a common standard becomes particularly evident in more complex segments, such as repairs.
dc.format.pagerange: 57
dc.format.pagerange: 61
dc.identifier.jour-issn: 1990-9772
dc.identifier.olddbid: 212349
dc.identifier.oldhandle: 10024/195367
dc.identifier.uri: https://www.utupub.fi/handle/11111/51075
dc.identifier.url: https://www.isca-archive.org/tmp/diss_2025/trouvain25_diss.html
dc.identifier.urn: URN:NBN:fi-fe202601216838
dc.language.iso: en
dc.okm.affiliatedauthor: Peltonen, Pauliina
dc.okm.discipline: 6121 Languages [en_GB]
dc.okm.discipline: 6121 Kielitieteet [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A4 Conference Article
dc.publisher.country: Portugal [en_GB]
dc.publisher.country: Portugali [fi_FI]
dc.publisher.country-code: PT
dc.relation.conference: Disfluency in Spontaneous Speech Workshop
dc.relation.doi: 10.21437/DiSS.2025-12
dc.relation.ispartofjournal: Interspeech
dc.source.identifier: https://www.utupub.fi/handle/10024/195367
dc.title: On variability in the identification and labelling of disfluencies — preliminary results from 23 annotations of the same data
dc.title.book: Proc. Disfluency in Spontaneous Speech (DiSS) Workshop 2025
dc.year.issued: 2025

Files

Name: trouvain25_diss.pdf
Size: 676.69 KB
Format: Adobe Portable Document Format