Donate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation

dc.contributor.authorLindén Krister
dc.contributor.authorJauhiainen Tommi
dc.contributor.authorLennes Mietta
dc.contributor.authorKurimo Mikko
dc.contributor.authorRossi Aleksi
dc.contributor.authorKurki Tommi
dc.contributor.authorPitkänen Olli
dc.contributor.organizationfi=kotimaiset kielet ja niiden sukukielet|en=Finnish, Finno-Ugric and Scandinavian languages|
dc.contributor.organization-code1.2.246.10.2458963.20.59108485091
dc.converis.publication-id176591290
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/176591290
dc.date.accessioned2025-08-28T00:31:31Z
dc.date.available2025-08-28T00:31:31Z
dc.description.abstractThe Donate Speech campaign aimed to collect 10,000 hours of ordinary, casual Finnish speech to be used for studying language as well as for developing technology and services that can be readily used in the languages spoken in Finland. In this project, particular attention has been devoted to allowing for both academic and commercial use of the material. Even though this ambitious target currently seems likely to evade us, the Donate Speech campaign has managed to amass an extensive resource of more than 4,000 hours of Finnish colloquial speech comprising more than 220,000 speech recordings by more than 25,000 speakers from all over Finland in just a few months. Keywords: speech resources, colloquial speech, large-scale data collection, academic and commercial use
dc.format.pagerange481
dc.format.pagerange510
dc.identifier.eisbn978-3-11-076737-7
dc.identifier.isbn978-3-11-076734-6
dc.identifier.olddbid205871
dc.identifier.oldhandle10024/188898
dc.identifier.urihttps://www.utupub.fi/handle/11111/35559
dc.identifier.urlhttps://doi.org/10.1515/9783110767377-019
dc.identifier.urnURN:NBN:fi-fe2022102462987
dc.language.isoen
dc.okm.affiliatedauthorKurki, Tommi
dc.okm.discipline6121 Languagesen_GB
dc.okm.discipline6121 Kielitieteetfi_FI
dc.okm.internationalcopublicationnot an international co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA3 Book
dc.publisherDe Gruyter
dc.publisher.countryGermanyen_GB
dc.publisher.countryUnited Statesen_GB
dc.publisher.countrySaksafi_FI
dc.publisher.countryYhdysvallat (USA)fi_FI
dc.publisher.country-codeDE
dc.publisher.country-codeUS
dc.publisher.isbn978-3-11; 978-3-484; 978-3-597; 978-3-598; 978-3-7940; 978-3-11-025877-6
dc.publisher.placeBerlin & Boston
dc.relation.doi10.1515/9783110767377-019
dc.source.identifierhttps://www.utupub.fi/handle/10024/188898
dc.titleDonate Speech: Collecting and Sharing a Large-Scale Speech Database for Social Sciences, Humanities and Artificial Intelligence Research and Innovation
dc.title.bookCLARIN: The Infrastructure for Language Resources
dc.year.issued2022

Files

Name:
10.1515_9783110767377-019.pdf
Size:
1.89 MB
Format:
Adobe Portable Document Format