A Machine Learning Approach to Indoor Localization Data Mining
Lindqvist, Samuel (2020-11-25)
This publication is subject to copyright regulations. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Open access
The permanent address of the publication is:
https://urn.fi/URN:NBN:fi-fe20201209100100
Abstract
Indoor positioning systems are increasingly commonplace in various environments and
produce large quantities of data. They are used in industrial applications, robotics,
and asset and employee tracking, to name just a few use cases. The growing amount of data
and the accelerating progress of machine learning open up many new possibilities for
analyzing this data in ways that were not conceivable or relevant before. This paper
introduces the connected concepts and implementations needed to answer the question of how
this data can be utilized. The data gathered in this thesis originates from an indoor positioning
system deployed in a retail environment, but the discussed methods can be applied generally.
The issue is approached by first introducing the concept of machine learning
and, more generally, artificial intelligence, and how they work on a general level. The
subfields and algorithms that are relevant to the data mining task at hand are then
examined in more depth. The basics of indoor positioning systems are also briefly discussed
to create a base understanding of the realistic capabilities and constraints of these kinds of systems.
These methods and previous knowledge from the literature are put to the test with the
freshly gathered data. An algorithm based on an existing example from the literature was tested
and improved upon with the new data. A novel method to cluster and classify movement
patterns was introduced, utilizing deep learning to create embedded representations of the
trajectories in a more complex learning pipeline. This type of learning is often referred
to as deep clustering.
The results are promising, and both methods produce useful high-level representations
of the complex dataset. These representations can help a human operator discern the
relevant patterns in the raw data and can be used as input for subsequent supervised and
unsupervised learning steps. Several factors related to optimizing the learning pipeline,
such as regularization, were also researched, and the results are presented as visualizations.
The research found that a pipeline consisting of a CNN autoencoder followed by a classic
clustering algorithm such as DBSCAN produces useful results in the form of trajectory
clusters. Regularization such as L1 regression improves this performance.
The research done in this paper presents useful algorithms for processing raw, noisy
localization data from indoor environments that can be used for further implementations
in both industrial applications and academia.
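The clustering stage of the pipeline named in the abstract (embedded trajectory representations passed to DBSCAN) can be illustrated with a minimal sketch. This is not the thesis's code: the CNN autoencoder is not reproduced, the `embeddings` list is a hypothetical hand-written stand-in for its latent vectors, the `eps` and `min_samples` values are illustrative assumptions, and `dbscan` is a plain-Python reimplementation of the standard algorithm written for this example.

```python
import math


def region_query(points, i, eps):
    """Return indices of all points within eps of points[i] (i included)."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        # points[i] is a core point: start a new cluster and grow it.
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reachable from a core point is a border point
            if labels[j] is not None:
                continue  # already assigned to a cluster
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_samples:
                queue.extend(j_neighbors)  # j is also a core point: keep expanding
    return labels


# Hypothetical 2-D latent vectors, one per trajectory (assumption: a real
# encoder would produce these, usually with more dimensions).
embeddings = [
    (0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),  # similar trajectories, group A
    (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (5.1, 5.1),  # similar trajectories, group B
    (10.0, 0.0),                                     # an isolated trajectory
]

labels = dbscan(embeddings, eps=0.5, min_samples=3)
print(labels)  # [0, 0, 0, 0, 1, 1, 1, 1, -1]
```

In this sketch the two dense groups receive their own cluster ids and the isolated vector is marked as noise (-1), which is the property that makes a density-based method attractive for noisy localization data.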