Designing fair and compliant AI: Evaluating bias mitigation methods under data minimisation constraints: A case study on the COMPAS dataset
| dc.contributor.author | Bekkers, Johannes | |
| dc.contributor.department | fi=Johtamisen ja yrittäjyyden laitos|en=Department of Management and Entrepreneurship| | |
| dc.contributor.faculty | fi=Turun kauppakorkeakoulu|en=Turku School of Economics| | |
| dc.contributor.studysubject | fi=Tietojärjestelmätiede|en=Information Systems Science| | |
| dc.date.accessioned | 2025-10-20T21:04:14Z | |
| dc.date.available | 2025-10-20T21:04:14Z | |
| dc.date.issued | 2025-08-18 | |
| dc.description.abstract | The increasing integration of machine learning into high-stakes decision-making has intensified concerns about fairness, accountability, and compliance with data protection regulations. A central tension arises between the General Data Protection Regulation’s data minimisation principle, which restricts access to sensitive attributes such as race, and the practical need for such data in bias mitigation and fairness auditing. This study examines whether algorithmic fairness can be achieved when bias mitigation techniques are applied under data minimisation constraints. Using the publicly available COMPAS recidivism dataset as a case study, two interpretable and widely used classifiers, Logistic Regression and Random Forest, were trained on both a dataset that included race and a data-minimised version that excluded it from model training. Models were evaluated using a variety of well-established performance and fairness metrics. In the full-data setting, bias mitigation methods requiring sensitive attributes improved fairness while maintaining performance. In the data-minimised setting, baseline models exhibited higher fairness but slightly reduced predictive performance. Mitigation options were restricted to a narrow range of post-hoc interventions, with effectiveness concentrated in a single method. This setting nonetheless achieved stronger fairness improvements than the full-data case, despite the limited toolkit. The findings highlight a trade-off in which data minimisation can provide a fairer starting point but narrows the range and reliability of available bias mitigation techniques, creating dependence on specific interventions. These insights are relevant to policymakers and practitioners seeking to balance privacy compliance with robust algorithmic fairness. | |
| dc.format.extent | 146 | |
| dc.identifier.olddbid | 211303 | |
| dc.identifier.oldhandle | 10024/194323 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/12660 | |
| dc.identifier.urn | URN:NBN:fi-fe20251020102286 | |
| dc.language.iso | eng | |
| dc.rights | fi=Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty.|en=This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.| | |
| dc.rights.accessrights | avoin (open) | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/194323 | |
| dc.subject | Machine Learning, Algorithmic Fairness, GDPR, Bias Mitigation, Data Minimisation, COMPAS, Ethical AI | |
| dc.title | Designing fair and compliant AI: Evaluating bias mitigation methods under data minimisation constraints: A case study on the COMPAS dataset | |
| dc.type.ontasot | fi=Pro gradu -tutkielma|en=Master's thesis| |
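The fairness audit described in the abstract — comparing model outputs across racial groups using established group-fairness metrics — can be sketched in miniature. The code below is an illustrative sketch, not the thesis's implementation: the data is hypothetical (not drawn from COMPAS), and the two metrics shown, statistical parity difference and equal opportunity difference, are standard examples of the kind of fairness metrics the study evaluates.

```python
def statistical_parity_difference(y_pred, group):
    """P(pred=1 | group=0) - P(pred=1 | group=1): gap in positive-prediction rates."""
    rate = lambda g: sum(p for p, a in zip(y_pred, group) if a == g) / group.count(g)
    return rate(0) - rate(1)

def equal_opportunity_difference(y_true, y_pred, group):
    """TPR(group=0) - TPR(group=1): gap in true-positive rates among actual positives."""
    def tpr(g):
        preds = [p for t, p, a in zip(y_true, y_pred, group) if a == g and t == 1]
        return sum(preds) / len(preds)
    return tpr(0) - tpr(1)

# Hypothetical audit data: binary labels, predictions, and a binary
# sensitive attribute (0/1 standing in for racial group membership).
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 1, 1]
group  = [0, 0, 0, 0, 1, 1, 1, 1]

print(statistical_parity_difference(y_pred, group))          # -> -0.25
print(equal_opportunity_difference(y_true, y_pred, group))   # -> -0.333...
```

Note that computing these metrics requires the sensitive attribute at audit time even when, as in the data-minimised setting, it is excluded from model training — which is exactly the tension between data minimisation and fairness auditing that the thesis examines.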
Files
- Name: Designing_fair_and_compliant_AI.pdf
- Size: 1.47 MB
- Format: Adobe Portable Document Format