Explaining Classes through Stable Word Attributions
| dc.contributor.author | Rönnqvist Samuel | |
| dc.contributor.author | Myntti Amanda | |
| dc.contributor.author | Kyröläinen Aki-Juhani | |
| dc.contributor.author | Ginter Filip | |
| dc.contributor.author | Laippala Veronika | |
| dc.contributor.organization | fi=data-analytiikka|en=Data-analytiikka| | |
| dc.contributor.organization | fi=kieli- ja käännöstieteiden laitos|en=School of Languages and Translation Studies| | |
| dc.contributor.organization-code | 1.2.246.10.2458963.20.56461112866 | |
| dc.contributor.organization-code | 1.2.246.10.2458963.20.68940835793 | |
| dc.contributor.organization-code | 2602100 | |
| dc.converis.publication-id | 176874206 | |
| dc.converis.url | https://research.utu.fi/converis/portal/Publication/176874206 | |
| dc.date.accessioned | 2025-08-28T02:43:05Z | |
| dc.date.available | 2025-08-28T02:43:05Z | |
| dc.description.abstract | Input saliency methods have recently become a popular tool for explaining predictions of deep learning models in NLP. Nevertheless, there has been little work investigating methods for aggregating prediction-level explanations to the class level, nor has a framework for evaluating such class explanations been established. We explore explanations based on XLM-R and the Integrated Gradients input attribution method, and propose 1) the Stable Attribution Class Explanation method (SACX) to extract keyword lists of classes in text classification tasks, and 2) a framework for the systematic evaluation of the keyword lists. We find that explanations of individual predictions are prone to noise, but that stable explanations can be effectively identified through repeated training and explanation. We evaluate on web register data and show that the class explanations are linguistically meaningful and distinguishing of the classes. | |
| dc.format.pagerange | 1063 | |
| dc.format.pagerange | 1074 | |
| dc.identifier.isbn | 978-1-955917-25-4 | |
| dc.identifier.jour-issn | 0736-587X | |
| dc.identifier.olddbid | 209575 | |
| dc.identifier.oldhandle | 10024/192602 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/47850 | |
| dc.identifier.url | https://aclanthology.org/2022.findings-acl.85 | |
| dc.identifier.urn | URN:NBN:fi-fe2022112968032 | |
| dc.language.iso | en | |
| dc.okm.affiliatedauthor | Rönnqvist, Samuel | |
| dc.okm.affiliatedauthor | Myntti, Amanda | |
| dc.okm.affiliatedauthor | Kyröläinen, Aki | |
| dc.okm.affiliatedauthor | Ginter, Filip | |
| dc.okm.affiliatedauthor | Laippala, Veronika | |
| dc.okm.discipline | 113 Computer and information sciences | en_GB |
| dc.okm.discipline | 6121 Languages | en_GB |
| dc.okm.discipline | 113 Tietojenkäsittely ja informaatiotieteet | fi_FI |
| dc.okm.discipline | 6121 Kielitieteet | fi_FI |
| dc.okm.internationalcopublication | not an international co-publication | |
| dc.okm.internationality | International publication | |
| dc.okm.type | A4 Conference Article | |
| dc.publisher.country | United States | en_GB |
| dc.publisher.country | Yhdysvallat (USA) | fi_FI |
| dc.publisher.country-code | US | |
| dc.relation.conference | Annual Meeting of the Association for Computational Linguistics | |
| dc.relation.doi | 10.18653/v1/2022.findings-acl.85 | |
| dc.relation.ispartofjournal | Annual Meeting of the Association for Computational Linguistics | |
| dc.relation.ispartofseries | Annual Meeting of the Association for Computational Linguistics | |
| dc.relation.volume | 60 | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/192602 | |
| dc.title | Explaining Classes through Stable Word Attributions | |
| dc.title.book | The 60th Annual Meeting of the Association for Computational Linguistics: Findings of ACL 2022 | |
| dc.year.issued | 2022 | |
Files
- Name:
- Explaining classes through Stable WOrld Attributions.pdf
- Size:
- 5.07 MB
- Format:
- Adobe Portable Document Format