Text analysis of handwritten production deviations

Kangas, Kaarina

dc.contributor.author	Kangas, Kaarina
dc.contributor.department	fi=Tietotekniikan laitos\|en=Department of Computing\|
dc.contributor.faculty	fi=Teknillinen tiedekunta\|en=Faculty of Technology\|
dc.contributor.studysubject	fi=Tietotekniikka\|en=Information and Communication Technology\|
dc.date.accessioned	2021-06-17T21:01:39Z
dc.date.available	2021-06-17T21:01:39Z
dc.date.issued	2021-06-14
dc.description.abstract	Companies want to understand the latest trends and to summarize product status or public opinion from social media data. Because the data is rich and very diverse, automated, real-time opinion polling and data mining have become necessary. This need has made text analysis hugely popular, and text analysis is now being developed for and applied in more and more industries, not only for tasks such as evaluating consumer feedback. Natural language processing (NLP) is a subfield of linguistics, computer science, and artificial intelligence that aims to enable computers to understand and interpret human language. Its goal and strength is specifically to program computers to process and analyze large amounts of natural language. NLP technology can extract data accurately from text and classify and organize it. Machine learning methods make text analysis much faster and more efficient than processing the text manually. The methods can be used to reduce labor costs and speed up the processing of texts without compromising on quality. The main focus of the thesis is to study the textual material received from the client and to develop a prediction model based on it using NLP techniques. The research strategy is a case study. The text data, about 9,000 sentences, comes from production deviations observed in the welding and assembly process between November 2016 and September 2018. The sentences, i.e. user comments, were available for all stages from the detection of a deviation to its resolution. This study focuses on the first observational comment written about each deviation. Based on these comments, a predictive model was trained that predicts the likely root cause of a deviation from its first comment.
The research material was analyzed using both traditional machine learning methods and more advanced deep learning methods, namely the pre-trained FinBERT and multilingual BERT models. Accuracy was the key measure for comparing the models. The result was a reliable prediction model that can predict whether a deviation falls into class 100 (missing part) or class 200 (other deviations). The best accuracy was 85.7 % for the traditional machine learning models and 82.6 % for the transformer models. The most common word across all the Finnish sentences was "puuttua" in its different forms.
dc.format.extent	70
dc.identifier.olddbid	169199
dc.identifier.oldhandle	10024/152320
dc.identifier.uri	https://www.utupub.fi/handle/11111/14838
dc.identifier.urn	URN:NBN:fi-fe2021061738440
dc.language.iso	eng
dc.rights	fi=Julkaisu on tekijänoikeussäännösten alainen. Teosta voi lukea ja tulostaa henkilökohtaista käyttöä varten. Käyttö kaupallisiin tarkoituksiin on kielletty.\|en=This publication is copyrighted. You may download, display and print it for Your own personal use. Commercial use is prohibited.\|
dc.rights.accessrights	open
dc.source.identifier	https://www.utupub.fi/handle/10024/152320
dc.subject	NLP, text analysis, machine learning, transformer
dc.title	Text analysis of handwritten production deviations
dc.type.ontasot	fi=Pro gradu -tutkielma\|en=Master's thesis\|

Files

Name: Thesis Kaarina Kangas.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format

Collections

Master's theses, diploma theses, and advanced studies theses (full texts)