A global and interoperable dataset of linguistic distributions derived from the Atlas of the World’s Languages

dc.contributor.authorRanacher, Peter
dc.contributor.authorForkel, Robert
dc.contributor.authorEfrat-Kowalsky, Nour
dc.contributor.authorUrban, Matthias
dc.contributor.authorHehli, Antonia
dc.contributor.authorFranz, Micha
dc.contributor.authorBiland, Gregory
dc.contributor.authorKreienbühl, Aaron
dc.contributor.authorHermida Rodríguez, Alba
dc.contributor.authorAzevedo, Matheus
dc.contributor.authorRomar, Martijn
dc.contributor.authorKlaussova, Andrea
dc.contributor.authorTakahashi, Takuya
dc.contributor.authorNeureiter, Nico
dc.contributor.authorvan Gijn, Rik
dc.contributor.authorRoose, Meeli
dc.contributor.authorVesakoski, Outi
dc.contributor.authorWeibel, Robert
dc.contributor.authorKaiping, Gereon
dc.contributor.authorNorder, Sietze
dc.contributor.organizationfi=kotimaiset kielet ja niiden sukukielet|en=Finnish, Finno-Ugric and Scandinavian languages|
dc.contributor.organizationfi=maantiede|en=Geography |
dc.contributor.organization-code1.2.246.10.2458963.20.17647764921
dc.contributor.organization-code1.2.246.10.2458963.20.59108485091
dc.converis.publication-id500015565
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/500015565
dc.date.accessioned2026-01-21T14:52:08Z
dc.date.available2026-01-21T14:52:08Z
dc.description.abstractAsher and Moseley’s Atlas of the World’s Languages illustrates the past and present spatial distribution of human languages across more than 100 maps. While the Atlas is an impressive resource, its data are not readily accessible for research. Language areas are presented as printed maps and referenced by name, rather than as digital spatial objects linked to a standardised language catalogue. To address these limitations, we present a digital dataset derived from the Atlas. We georeferenced the map images, digitised the language polygons in a Geographic Information System (GIS), and linked each polygon to a Glottocode, a unique identifier for languages and language varieties. Following the FAIR principles, we provide the data as a faithful digital replication of the Atlas (comprising 6,992 distinct language areas) and in enriched, aggregated versions for contemporary and traditional languages. The datasets capture the spatial distribution of human languages as depicted in the Atlas, with each polygon linked to an unambiguous identifier, enabling computational analyses of the origins, distribution, and drivers of global linguistic diversity.
dc.identifier.eissn2052-4463
dc.identifier.jour-issn2052-4463
dc.identifier.olddbid213813
dc.identifier.oldhandle10024/196831
dc.identifier.urihttps://www.utupub.fi/handle/11111/55966
dc.identifier.urlhttps://doi.org/10.1038/s41597-025-05828-6
dc.identifier.urnURN:NBN:fi-fe202601217046
dc.language.isoen
dc.okm.affiliatedauthorRoose, Meeli
dc.okm.affiliatedauthorVesakoski, Outi
dc.okm.discipline113 Computer and information sciencesen_GB
dc.okm.discipline1172 Environmental sciencesen_GB
dc.okm.discipline6121 Languagesen_GB
dc.okm.discipline113 Tietojenkäsittely ja informaatiotieteetfi_FI
dc.okm.discipline1172 Ympäristötiedefi_FI
dc.okm.discipline6121 Kielitieteetfi_FI
dc.okm.internationalcopublicationinternational co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA1 DataArticle
dc.publisherSpringer Nature
dc.publisher.countryUnited Kingdomen_GB
dc.publisher.countryBritanniafi_FI
dc.publisher.country-codeGB
dc.relation.articlenumber1466
dc.relation.doi10.1038/s41597-025-05828-6
dc.relation.ispartofjournalScientific Data
dc.relation.volume12
dc.source.identifierhttps://www.utupub.fi/handle/10024/196831
dc.titleA global and interoperable dataset of linguistic distributions derived from the Atlas of the World’s Languages
dc.year.issued2025

Files

Name:
s41597-025-05828-6.pdf
Size:
1.24 MB
Format:
Adobe Portable Document Format