Text analysis of handwritten production deviations
| dc.contributor.author | Kangas, Kaarina | |
| dc.contributor.department | Department of Computing | |
| dc.contributor.faculty | Faculty of Technology | |
| dc.contributor.studysubject | Information and Communication Technology | |
| dc.date.accessioned | 2021-06-17T21:01:39Z | |
| dc.date.available | 2021-06-17T21:01:39Z | |
| dc.date.issued | 2021-06-14 | |
| dc.description.abstract | Companies want to understand the latest trends and to summarize product status or public opinion from social media data. Because the data is rich and highly diverse, automated, real-time opinion polling and data mining have become necessary. This need has made text analysis very popular, and its development and use are spreading to more and more industries, well beyond the evaluation of consumer feedback. Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence that aims to enable computers to understand and interpret human language. Its goal and particular strength is programming computers to process and analyze large amounts of natural language. NLP technology can accurately extract data from text and classify and organize it. Machine learning methods make text analysis much faster and more efficient than manual processing, and they can reduce labor costs and speed up the processing of texts without compromising quality. The main focus of the thesis is to study the textual material received from the client and to develop a prediction model from it using NLP techniques. The research strategy is a case study. The text data, about 9,000 sentences, covers production deviations observed in the welding and assembly process between November 2016 and September 2018. The sentences, i.e. user comments, were available for every stage from the detection of a deviation to its resolution. This study focuses on the first observational comment written about each deviation. From these comments, a model was trained that predicts the likely root cause of a deviation from its first comment.
The research material was analyzed using both traditional machine learning methods and more advanced deep learning methods, namely the pre-trained FinBERT and multilingual BERT models. Accuracy was the key measure for comparing the models. The result was a reliable prediction model that predicts whether a deviation falls into class 100 (missing part) or class 200 (other deviations). The best traditional machine learning model reached an accuracy of 85.7 %, and the best transformer model 82.6 %. The most common word across all Finnish sentences was "puuttua" in its various forms. | |
| dc.format.extent | 70 | |
| dc.identifier.olddbid | 169199 | |
| dc.identifier.oldhandle | 10024/152320 | |
| dc.identifier.uri | https://www.utupub.fi/handle/11111/14838 | |
| dc.identifier.urn | URN:NBN:fi-fe2021061738440 | |
| dc.language.iso | eng | |
| dc.rights | This publication is copyrighted. You may download, display and print it for your own personal use. Commercial use is prohibited. | |
| dc.rights.accessrights | open | |
| dc.source.identifier | https://www.utupub.fi/handle/10024/152320 | |
| dc.subject | NLP, text analysis, machine learning, transformer | |
| dc.title | Text analysis of handwritten production deviations | |
| dc.type.ontasot | Master's thesis | |
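
The classification task described in the abstract (predicting class 100, missing part, or class 200, other deviations, from the first comment) can be illustrated with a minimal sketch. This is not the thesis's model: the abstract does not name the traditional methods used, so a small multinomial naive Bayes classifier stands in here as one possible example. The training comments are invented for illustration, and truncating tokens to five characters is an assumed, crude substitute for proper Finnish lemmatization.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical first comments, invented for illustration (not thesis data).
# Class 100 = missing part, class 200 = other deviations.
TRAIN = [
    ("ruuvi puuttuu kotelosta", 100),
    ("osa puuttuu kokoonpanosta", 100),
    ("tiiviste puuttui toimituksesta", 100),
    ("hitsaussauma on vino", 200),
    ("maalipinta naarmuuntunut", 200),
    ("reikä väärässä kohdassa", 200),
]


def tokenize(text):
    """Lowercase, split into words, and keep the first five letters of each.

    The truncation is an assumed stand-in for lemmatization, so that
    inflected forms such as "puuttuu" and "puuttui" share one token.
    """
    return [tok[:5] for tok in re.findall(r"[a-zåäö]+", text.lower())]


def train(samples):
    """Count documents per class and token occurrences per class."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in samples:
        class_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return class_counts, word_counts, vocab


def predict(model, text):
    """Return the class with the highest log-probability (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAIN)
print(predict(model, "kansi puuttuu"))  # 100
print(predict(model, "sauma on vino"))  # 200
```

With only six toy sentences the sketch says nothing about real accuracy; it only shows the shape of the input (a first comment) and the output (a class label).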