A Resource-Efficient Codebook-Driven Semantic Structuring Pipeline for Human-AI Dialogue in Ambient Intelligent Systems

dc.contributor.author: Adeseye, Aisvarya
dc.contributor.author: Isoaho, Jouni
dc.contributor.author: Virtanen, Seppo
dc.contributor.author: Mohammad, Tahir
dc.contributor.organization: fi=kyberturvallisuusteknologia|en=Cyber Security Engineering|
dc.contributor.organization-code: 1.2.246.10.2458963.20.28753843706
dc.converis.publication-id: 526472743
dc.converis.url: https://research.utu.fi/converis/portal/Publication/526472743
dc.date.accessioned: 2026-06-10T20:12:16Z
dc.description.abstract: Human–AI dialogue in ambient intelligent systems increasingly relies on large language models (LLMs). When questions are generated dynamically to enable personalized, context-aware interactions, phrasing and topical focus vary between conversations. Without structured organization, which is often extremely resource-intensive to produce, conversational data remains fragmented and cannot be reliably used for systematic analysis or reporting. This study proposes a semantic structuring pipeline that maps LLM-generated questions to shared codes, sub-themes, and themes using a predefined codebook. The multi-stage pipeline applies semantic screening, factor-based scoring, mathematical aggregation, and validation checks, supported by locally deployed LLMs and manual confirmation. The pipeline was evaluated on 6,030 question–response pairs collected from dynamic interviews across three research objectives. It achieved an overall mapping accuracy of 97% while reducing hallucinated semantic matches to 1.2% through layered validation. The results indicate that the framework reduces hallucinated matches and improves mapping accuracy while remaining computationally efficient for private local deployment.
dc.format.pagerange: 552-559
dc.identifier.uri: https://www.utupub.fi/handle/11111/61692
dc.identifier.url: https://doi.org/10.1016/j.procs.2026.04.070
dc.identifier.urn: URN:NBN:fi-fe2026061066543
dc.language.iso: en
dc.okm.affiliatedauthor: Adeseye, Aisvarya
dc.okm.affiliatedauthor: Isoaho, Jouni
dc.okm.affiliatedauthor: Virtanen, Seppo
dc.okm.affiliatedauthor: Mohammad, Tahir
dc.okm.discipline: 113 Computer and information sciences (en_GB)
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet (fi_FI)
dc.okm.discipline: 213 Electronic, automation and communications engineering, electronics (en_GB)
dc.okm.discipline: 213 Sähkö-, automaatio- ja tietoliikennetekniikka, elektroniikka (fi_FI)
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A4 Conference Article
dc.publisher.country: Netherlands (en_GB)
dc.publisher.country: Alankomaat (fi_FI)
dc.publisher.country-code: NL
dc.relation.conference: International Conference on Ambient Systems, Networks and Technologies Networks
dc.relation.doi: 10.1016/j.procs.2026.04.070
dc.relation.ispartofjournal: Procedia Computer Science
dc.relation.volume: 280
dc.title: A Resource-Efficient Codebook-Driven Semantic Structuring Pipeline for Human-AI Dialogue in Ambient Intelligent Systems
dc.title.book: The 17th International Conference on Ambient Systems, Networks and Technologies Networks (ANT)/ the 9th International Conference on Emerging Data and Industry 4.0 (EDI40)
dc.year.issued: 2026

Files

Name: 1-s2.0-S1877050926010835-main.pdf
Size: 461.89 KB
Format: Adobe Portable Document Format