Evaluation of variant calling tools for detecting structural variants using real and simulated genomic datasets

Master's thesis
Closed access
This publication is subject to copyright. The work may be read and printed for personal use. Commercial use is prohibited.

Abstract

Structural variants are generally defined as DNA variations larger than 50 bp. They have been recognized as the largest source of inter-individual genetic variation and have been shown to play an important role in human life. Structural variants come in several types, and next-generation sequencing now makes it possible to screen for them. Many variant calling tools, built on different detection algorithms, have been published for this purpose. However, little information is available on how well these tools perform when calling structural variants. In this thesis, five state-of-the-art variant calling tools were therefore comprehensively evaluated. The results can help researchers choose the most suitable variant calling tool for their data types and research questions.

The first three chapters review the biological and computational background of DNA sequencing technologies and previous evaluation studies of variant calling tools. The fourth chapter introduces the materials and methods used in this thesis. The results of the evaluation are presented in the fifth chapter, and the discussion and conclusion follow in the sixth and seventh chapters.

The study evaluated five open-source, widely used variant calling tools: Pindel, ScanIndel, Fermikit, VarDict and VarScan. The tools were evaluated on both real and simulated genomic data. Performance was measured with metrics including precision, recall, detected variant length, running time and the similarity of variant calls between tools. The results indicate that there is no single “multipurpose” tool; instead, different tools are good at detecting specific variant types within specific size ranges.
