Kuura—An automated workflow for analyzing WES and WGS data

dc.contributor.author: Jambulingam Dhanaprakash
dc.contributor.author: Rathinakannan Venkat Subramaniam
dc.contributor.author: Heron Samuel
dc.contributor.author: Schleutker Johanna
dc.contributor.author: Fey Vidal
dc.contributor.organization: fi=biolääketieteen laitos|en=Institute of Biomedicine|
dc.contributor.organization: fi=tyks, vsshp|en=tyks, varha|
dc.contributor.organization-code: 1.2.246.10.2458963.20.77952289591
dc.contributor.organization-code: 2607100
dc.converis.publication-id: 387230360
dc.converis.url: https://research.utu.fi/converis/portal/Publication/387230360
dc.date.accessioned: 2025-08-28T01:08:57Z
dc.date.available: 2025-08-28T01:08:57Z
dc.description.abstract: The advent of high-throughput sequencing technologies has revolutionized the field of genomic sciences by cutting down the cost and time associated with standard sequencing methods. This advancement has not only provided the research community with an abundance of data but has also presented the challenge of analyzing it. The paramount challenge in analyzing this copious amount of data lies in making optimal use of the available tools. To address this research gap, we propose "Kuura-An automated workflow for analyzing WES and WGS data", which is optimized for both whole-exome sequencing (WES) and whole-genome sequencing (WGS) data. The workflow is written in the Nextflow pipeline scripting language and uses Docker to manage and deploy its components. It consists of four analysis stages: (1) quality control, (2) mapping to the reference genome and quality score recalibration, (3) variant calling and variant recalibration, and (4) variant consensus and annotation. An important feature of the DNA-seq workflow is that it combines multiple variant callers (GATK HaplotypeCaller, DeepVariant, VarScan2, FreeBayes and Strelka2), generating a list of high-confidence variants in a consensus call file. The workflow is flexible: it integrates otherwise fragmented tools and can easily be extended by adding or updating tools or by amending the parameters list. The use of a single parameters file enhances the reproducibility of the results. The ease of deployment and usage of the workflow further increases computational reproducibility, providing researchers with a standardized tool for the variant calling step in different projects. The source code and the instructions for installing and using the tool are publicly available at our GitHub repository: https://github.com/dhanaprakashj/kuura_pipeline.
dc.identifier.eissn: 1932-6203
dc.identifier.jour-issn: 1932-6203
dc.identifier.olddbid: 207103
dc.identifier.oldhandle: 10024/190130
dc.identifier.uri: https://www.utupub.fi/handle/11111/50299
dc.identifier.url: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0296785
dc.identifier.urn: URN:NBN:fi-fe2025082787560
dc.language.iso: en
dc.okm.affiliatedauthor: Jambulingam, Dhanaprakash
dc.okm.affiliatedauthor: Rathinakannan, Venkat
dc.okm.affiliatedauthor: Heron, Samuel
dc.okm.affiliatedauthor: Schleutker, Johanna
dc.okm.affiliatedauthor: Fey, Vidal
dc.okm.affiliatedauthor: Dataimport, tyks, vsshp
dc.okm.discipline: 3111 Biomedicine [en_GB]
dc.okm.discipline: 318 Medical biotechnology [en_GB]
dc.okm.discipline: 3111 Biolääketieteet [fi_FI]
dc.okm.discipline: 318 Lääketieteen bioteknologia [fi_FI]
dc.okm.internationalcopublication: not an international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher: Public Library of Science (PLoS)
dc.publisher.country: United States [en_GB]
dc.publisher.country: Yhdysvallat (USA) [fi_FI]
dc.publisher.country-code: US
dc.relation.doi: 10.1371/journal.pone.0296785
dc.relation.ispartofjournal: PLoS ONE
dc.relation.issue: 1
dc.relation.volume: 19
dc.source.identifier: https://www.utupub.fi/handle/10024/190130
dc.title: Kuura—An automated workflow for analyzing WES and WGS data
dc.year.issued: 2024
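
The abstract says the workflow combines several variant callers into a consensus call file but does not state the consensus rule. As an illustration only, the following Python sketch assumes a simple vote: a variant is kept when at least `min_callers` callers report it. The function name `consensus_variants`, the threshold, and the example variants are all hypothetical and are not taken from the Kuura source code.

```python
from collections import Counter

# A variant is keyed by (chromosome, 1-based position, ref allele, alt allele).
Variant = tuple[str, int, str, str]


def consensus_variants(
    calls_by_caller: dict[str, set[Variant]], min_callers: int = 2
) -> list[Variant]:
    """Return variants reported by at least `min_callers` callers, sorted.

    Assumption: a plain vote count; the real pipeline may use another rule.
    """
    counts: Counter = Counter()
    for variants in calls_by_caller.values():
        # Each caller's calls are a set, so it votes once per variant.
        counts.update(variants)
    return sorted(v for v, n in counts.items() if n >= min_callers)


# Made-up example calls; caller names follow the abstract.
calls = {
    "haplotypecaller": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
    "deepvariant": {("chr1", 100, "A", "G"), ("chr2", 50, "G", "A")},
    "freebayes": {("chr1", 100, "A", "G"), ("chr1", 250, "C", "T")},
}

high_confidence = consensus_variants(calls, min_callers=2)
print(high_confidence)
```

Raising `min_callers` trades sensitivity for confidence: with three callers, a threshold of 3 keeps only variants on which all of them agree.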

Files

Name: journal.pone.0296785.pdf
Size: 1.42 MB
Format: Adobe Portable Document Format