Whole genome sequencing of patients with undiagnosed, suspected rare genetic disorders
Tursi, Amanda (2019-06-17)
This publication is subject to copyright. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Closed
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2019092329502
Abstract
The high prevalence of undiagnosed diseases is a major health concern throughout the world. The low incidence of each individual disease has hindered research into the disorders that have already been described, and thousands more diseases are estimated to remain unnamed. Since the majority of these rare diseases are estimated to be genetic in origin, next-generation sequencing has become an important tool for diagnosing previously undiagnosable diseases. Whole genome sequencing (WGS) in particular is becoming increasingly useful and cost-effective in rare disease research.
This thesis explores the use of whole genome sequencing in diagnosing patients with suspected rare genetic diseases. Samples from twenty patients at Turku University Hospital were collected and sent out for sequencing. A pipeline was created to process the returned data and find possibly pathogenic variants. For each patient, the pipeline took an input FASTQ file and aligned the reads to a reference genome. The resulting aligned and sorted BAM file was used to call single nucleotide variants and small insertions/deletions. After the variants were called, another program annotated them with gene- and variant-specific information such as population frequency, damage prediction scores, and conservation scores across species. Additional annotations specific to each patient's phenotype were then added. Finally, the annotated file was filtered to remove obviously benign variants. The final file output by the pipeline was examined manually to assess and classify the variants by likelihood of pathogenicity.
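The final filtering step can be illustrated with a minimal Python sketch. The abstract does not state which criteria the thesis used to decide that a variant is obviously benign, so everything here is an assumption made for illustration: the field names `pop_af` and `consequence`, the 1% allele-frequency cutoff, and the choice to drop synonymous variants are all hypothetical.

```python
from typing import Dict, List, Optional, Union

# Hypothetical record layout for one annotated variant; the field names
# are illustrative and are not taken from the thesis.
Variant = Dict[str, Union[str, float, None]]


def is_obviously_benign(variant: Variant, max_af: float = 0.01) -> bool:
    """Return True if the variant looks too common or too mild to be causal.

    Assumed rules (not from the thesis):
      * population allele frequency above `max_af` is too common for a
        rare disease;
      * a synonymous consequence is treated as benign.
    """
    af: Optional[float] = variant.get("pop_af")  # None = not seen in population data
    if af is not None and af > max_af:
        return True
    return variant.get("consequence") == "synonymous"


def filter_variants(variants: List[Variant], max_af: float = 0.01) -> List[Variant]:
    """Keep only the variants that are not obviously benign."""
    return [v for v in variants if not is_obviously_benign(v, max_af)]


if __name__ == "__main__":
    annotated = [
        {"id": "var1", "pop_af": 0.25, "consequence": "missense"},
        {"id": "var2", "pop_af": 0.0001, "consequence": "missense"},
        {"id": "var3", "pop_af": None, "consequence": "synonymous"},
        {"id": "var4", "pop_af": None, "consequence": "stop_gained"},
    ]
    for v in filter_variants(annotated):
        print(v["id"])
```

Variants absent from the population data (`pop_af` of `None`) are kept in this sketch, since a variant never observed in the population is, if anything, more interesting in a rare disease setting.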
This WGS pipeline and analysis process yielded several variants of uncertain significance that should be reassessed at a later date when more variant information is available. Additionally, possibly causative variants were found in four patients. These variants were classified as ‘likely pathogenic’ according to the American College of Medical Genetics and Genomics Standards and Guidelines for the Interpretation of Sequence Variants. Moreover, the diseases associated with these variants share phenotypic similarities with those of the patients. Supplemental clinician assessment and tests, such as sequencing parental samples, are recommended to confirm or rule out pathogenicity. Further efforts to improve the diagnostic yield for this group of patients are also proposed within this work.
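To make the ‘likely pathogenic’ label concrete, the sketch below encodes the rules for combining pathogenic evidence from the ACMG guidelines as a small Python function. It is a simplified illustration rather than the procedure followed in the thesis: the function name and its interface are hypothetical, it only counts pathogenic evidence by strength (very strong, strong, moderate, supporting), and it ignores benign evidence and conflicting-evidence handling, both of which a real assessment must consider.

```python
def classify_pathogenic_evidence(very_strong: int, strong: int,
                                 moderate: int, supporting: int) -> str:
    """Combine counts of pathogenic evidence into a classification.

    Simplified sketch: benign criteria are not modelled, so anything
    that does not reach 'likely pathogenic' is reported as
    'uncertain significance'.
    """
    pvs, ps, pm, pp = very_strong, strong, moderate, supporting

    pathogenic = (
        # One very strong criterion plus sufficient additional evidence
        (pvs >= 1 and (ps >= 1 or pm >= 2 or (pm >= 1 and pp >= 1) or pp >= 2))
        # Two or more strong criteria
        or ps >= 2
        # One strong criterion plus sufficient moderate/supporting evidence
        or (ps >= 1 and (pm >= 3 or (pm >= 2 and pp >= 2) or (pm >= 1 and pp >= 4)))
    )
    if pathogenic:
        return "pathogenic"

    likely_pathogenic = (
        (pvs >= 1 and pm >= 1)
        or (ps >= 1 and pm >= 1)
        or (ps >= 1 and pp >= 2)
        or pm >= 3
        or (pm >= 2 and pp >= 2)
        or (pm >= 1 and pp >= 4)
    )
    if likely_pathogenic:
        return "likely pathogenic"

    return "uncertain significance"


if __name__ == "__main__":
    # One strong and one moderate criterion
    print(classify_pathogenic_evidence(0, 1, 1, 0))
```

Checking the pathogenic rules first keeps the likely pathogenic rules simple, because any combination strong enough for the higher class has already been returned.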