Hyppää sisältöön
    • Suomeksi
    • In English
  • Suomeksi
  • In English
  • Kirjaudu
Näytä aineisto 
  •   Etusivu
  • 1. Kirjat ja opinnäytteet
  • Väitöskirjat
  • Näytä aineisto
  •   Etusivu
  • 1. Kirjat ja opinnäytteet
  • Väitöskirjat
  • Näytä aineisto
JavaScript is disabled for your browser. Some features of this site may not work without it.

Algorithmic Techniques in Gene Expression Processing. From Imputation to Visualization

Tuikkala, Johannes (2014-11-20)

Algorithmic Techniques in Gene Expression Processing. From Imputation to Visualization

Tuikkala, Johannes
(20.11.2014)
Katso/Avaa
TUCSDissertationD185.pdf (1.655Mb)
Lataukset: 

Turku Centre for Computer Science
Näytä kaikki kuvailutiedot
Julkaisun pysyvä osoite on:
https://urn.fi/URN:ISBN:978-952-12-3126-1

Kuvaus

Siirretty Doriasta
ei tietoa saavutettavuudesta
Tiivistelmä
The amount of biological data has grown exponentially in recent decades. Modern biotechnologies, such as microarrays and next-generation sequencing, are capable to produce massive amounts of biomedical data in a single experiment. As the amount of the data is rapidly growing there is an urgent need for reliable computational methods for analyzing and visualizing it. This thesis addresses this need by studying how to efficiently and reliably analyze and visualize high-dimensional data, especially that obtained from gene expression microarray experiments.

First, we will study the ways to improve the quality of microarray data by replacing (imputing) the missing data entries with the estimated values for these entries. Missing value imputation is a method which is commonly used to make the original incomplete data complete, thus making it easier to be analyzed with statistical and computational methods. Our novel approach was to use curated external biological information as a guide for the missing
value imputation.

Secondly, we studied the effect of missing value imputation on the downstream data analysis methods like clustering. We compared multiple recent imputation algorithms against 8 publicly available microarray data sets. It was observed that the missing value imputation indeed is a rational way to improve the quality of biological data. The research revealed differences between the clustering results obtained with different imputation methods. On most data sets, the simple and fast k-NN imputation was good enough, but there were also needs for more advanced imputation methods, such as Bayesian Principal Component Algorithm (BPCA).

Finally, we studied the visualization of biological network data. Biological interaction networks are examples of the outcome of multiple biological experiments such as using the gene microarray techniques. Such networks are typically very large and highly connected, thus there is a need for fast algorithms for producing visually pleasant layouts. A computationally efficient way to produce layouts of large biological interaction networks was developed. The algorithm uses multilevel optimization within the regular force directed graph layout algorithm.
Kokoelmat
  • Väitöskirjat [2951]

Turun yliopiston kirjasto | Turun yliopisto
julkaisut@utu.fi | Tietosuoja | Saavutettavuusseloste
 

 

Tämä kokoelma

JulkaisuajatTekijätNimekkeetAsiasanatTiedekuntaLaitosOppiaineYhteisöt ja kokoelmat

Omat tiedot

Kirjaudu sisäänRekisteröidy

Turun yliopiston kirjasto | Turun yliopisto
julkaisut@utu.fi | Tietosuoja | Saavutettavuusseloste