Can text and data mining exceptions and synthetic data training mitigate copyright-related concerns in generative AI?

dc.contributor.author: Manteghi, Maryna
dc.contributor.organization: fi=oikeustiede|en=Laws|
dc.contributor.organization-code: 1.2.246.10.2458963.20.53046050752
dc.converis.publication-id: 457845589
dc.converis.url: https://research.utu.fi/converis/portal/Publication/457845589
dc.date.accessioned: 2025-08-27T22:10:15Z
dc.date.available: 2025-08-27T22:10:15Z
dc.description.abstract: Rapidly emerging generative artificial intelligence (GenAI) models stand at the epicentre of current public discourse. They demonstrate impressive abilities to generate various types of data promptly and cost-effectively. However, AI developers need to train their systems on massive volumes of data which is usually copyrighted. Therefore, the growth of copyright-related concerns in the field of GenAI comes as no surprise. The study introduces two solutions which could mitigate the tension between copyright holders and AI developers, one legal (text and data mining (TDM) exceptions of the CDSM Directive) and one technical (synthetic data), highlighting the promises and challenges of both. First, the article will discuss the capability of TDM exceptions to facilitate the fundamental right to information and the freedom of research in the context of AI development. Next, the paper will analyse how providers of GenAI models can leverage synthetic data to comply with copyright law while training their systems and what risks might be associated with this approach. The findings of this study will indicate what issues, in both legal and technical spheres, should be addressed to ensure a balance of powers in the digital environment and effective functionality of the EU AI sector.
dc.identifier.eissn: 1757-997X
dc.identifier.jour-issn: 1757-9961
dc.identifier.olddbid: 201746
dc.identifier.oldhandle: 10024/184773
dc.identifier.uri: https://www.utupub.fi/handle/11111/49242
dc.identifier.url: https://www.tandfonline.com/doi/full/10.1080/17579961.2024.2392928
dc.identifier.urn: URN:NBN:fi-fe2025082785494
dc.language.iso: en
dc.okm.affiliatedauthor: Manteghi, Maryna
dc.okm.discipline: 513 Law (en_GB)
dc.okm.discipline: 513 Oikeustiede (fi_FI)
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Taylor and Francis Ltd.
dc.publisher.country: United Kingdom (en_GB)
dc.publisher.country: Britannia (fi_FI)
dc.publisher.country-code: GB
dc.relation.doi: 10.1080/17579961.2024.2392928
dc.relation.ispartofjournal: Law, innovation and technology
dc.source.identifier: https://www.utupub.fi/handle/10024/184773
dc.title: Can text and data mining exceptions and synthetic data training mitigate copyright-related concerns in generative AI?
dc.year.issued: 2024

Files

Name: Can text and data mining exceptions and synthetic data training mitigate copyright-related concerns in generative AI .pdf
Size: 937.49 KB
Format: Adobe Portable Document Format