Evaluation of tools for identifying large copy number variations from ultra-low-coverage whole-genome sequencing data

dc.contributor.authorSmolander Johannes
dc.contributor.authorKhan Sofia
dc.contributor.authorSingaravelu Kalaimathy
dc.contributor.authorKauko Leni
dc.contributor.authorLund Riikka J.
dc.contributor.authorLaiho Asta
dc.contributor.authorElo Laura L.
dc.contributor.organizationfi=Turun biotiedekeskus|en=Turku Bioscience Centre|
dc.contributor.organizationfi=biolääketieteen laitos|en=Institute of Biomedicine|
dc.contributor.organizationfi=tietotekniikan laitos|en=Department of Computing|
dc.contributor.organization-code1.2.246.10.2458963.20.18586209670
dc.contributor.organization-code1.2.246.10.2458963.20.77952289591
dc.contributor.organization-code1.2.246.10.2458963.20.85312822902
dc.contributor.organization-code2609201
dc.contributor.organization-code2609210
dc.converis.publication-id56057103
dc.converis.urlhttps://research.utu.fi/converis/portal/Publication/56057103
dc.date.accessioned2022-10-28T13:53:16Z
dc.date.available2022-10-28T13:53:16Z
dc.description.abstract<div><h3>Background</h3><p>Detection of copy number variations (CNVs) from high-throughput next-generation whole-genome sequencing (WGS) data has become a widely used research method in recent years. However, little is known about how well the developed algorithms apply to the ultra-low-coverage (0.0005–0.8×) data used in various research and clinical applications, such as digital karyotyping and single-cell CNV detection.</p><h3>Results</h3><p>Here, the performance of six popular read-depth-based CNV detection algorithms (BIC-seq2, Canvas, CNVnator, FREEC, HMMcopy, and QDNAseq) was studied using ultra-low-coverage WGS data. Real-world array-based and karyotyping kit-based validations were used as the benchmark in the evaluation. Additionally, ultra-low-coverage WGS data was simulated to investigate the ability of the algorithms to identify CNVs in the sex chromosomes and to determine the theoretical minimum coverage at which these tools can function accurately. Our results suggest that while all the methods were able to detect large CNVs, many were prone to producing false positives when detecting smaller CNVs (< 2 Mbp). There was also significant variability in their ability to identify CNVs in the sex chromosomes. Overall, BIC-seq2 was found to be the best method in terms of statistical performance. However, its significant drawback was that it had by far the slowest runtime among the methods (> 3 h), compared with ~ 3 min for FREEC, which we considered the second-best method.</p><h3>Conclusions</h3><p>Our comparative analysis demonstrates that CNV detection from ultra-low-coverage WGS data can be highly accurate for large copy number variations whose length is in the millions of base pairs. These findings facilitate applications that utilize ultra-low-coverage CNV detection.</p></div>
dc.identifier.eissn1471-2164
dc.identifier.jour-issn1471-2164
dc.identifier.olddbid184978
dc.identifier.oldhandle10024/168072
dc.identifier.urihttps://www.utupub.fi/handle/11111/41867
dc.identifier.urnURN:NBN:fi-fe2021093048825
dc.language.isoen
dc.okm.affiliatedauthorSmolander, Johannes
dc.okm.affiliatedauthorKhan, Sofia
dc.okm.affiliatedauthorSingaravelu, Kalaimathy
dc.okm.affiliatedauthorKauko, Leni
dc.okm.affiliatedauthorLund, Riikka
dc.okm.affiliatedauthorLaiho, Asta
dc.okm.affiliatedauthorElo, Laura
dc.okm.discipline1184 Genetics, developmental biology, physiology
dc.okm.internationalcopublicationnot an international co-publication
dc.okm.internationalityInternational publication
dc.okm.typeA1 ScientificArticle
dc.relation.articlenumber357
dc.relation.doi10.1186/s12864-021-07686-z
dc.relation.ispartofjournalBMC Genomics
dc.relation.volume22
dc.source.identifierhttps://www.utupub.fi/handle/10024/168072
dc.titleEvaluation of tools for identifying large copy number variations from ultra-low-coverage whole-genome sequencing data
dc.year.issued2021

Files

Name: s12864-021-07686-z.pdf
Size: 2.35 MB
Format: Adobe Portable Document Format
Description: Publisher's PDF