Benchmarking methods for detecting differential states between conditions from multi-subject single-cell RNA-seq data

Junttila Sini; Smolander Johannes; Elo Laura L.

dc.contributor.author	Junttila Sini
dc.contributor.author	Smolander Johannes
dc.contributor.author	Elo Laura L.
dc.contributor.organization	fi=Turun biotiedekeskus|en=Turku Bioscience Centre|
dc.contributor.organization-code	1.2.246.10.2458963.20.18586209670
dc.converis.publication-id	176235024
dc.converis.url	https://research.utu.fi/converis/portal/Publication/176235024
dc.date.accessioned	2022-10-28T12:27:37Z
dc.date.available	2022-10-28T12:27:37Z
dc.description.abstract	Single-cell RNA-sequencing (scRNA-seq) enables researchers to quantify transcriptomes of thousands of cells simultaneously and study transcriptomic changes between cells. scRNA-seq datasets increasingly include multisubject, multicondition experiments to investigate cell-type-specific differential states (DS) between conditions. This can be performed by first identifying the cell types in all the subjects and then by performing a DS analysis between the conditions within each cell type. Naive single-cell DS analysis methods that treat cells as statistically independent are subject to false positives in the presence of variation between biological replicates, an issue known as the pseudoreplicate bias. While several methods have already been introduced to carry out the statistical testing in multisubject scRNA-seq analysis, comparisons that include all these methods are currently lacking. Here, we performed a comprehensive comparison of 18 methods for the identification of DS changes between conditions from multisubject scRNA-seq data. Our results suggest that the pseudobulk methods generally performed best. Both pseudobulks and mixed models that model the subjects as a random effect were superior to the naive single-cell methods that do not model the subjects in any way. While the naive models achieved higher sensitivity than the pseudobulk methods and the mixed models, they were subject to a high number of false positives. In addition, accounting for subjects through latent variable modeling did not improve the performance of the naive methods.
dc.identifier.eissn	1477-4054
dc.identifier.jour-issn	1467-5463
dc.identifier.olddbid	176553
dc.identifier.oldhandle	10024/159647
dc.identifier.uri	https://www.utupub.fi/handle/11111/32034
dc.identifier.url	https://doi.org/10.1093/bib/bbac286
dc.identifier.urn	URN:NBN:fi-fe2022091258554
dc.language.iso	en
dc.okm.affiliatedauthor	Junttila, Sini
dc.okm.affiliatedauthor	Smolander, Johannes
dc.okm.affiliatedauthor	Elo, Laura
dc.okm.discipline	3111 Biomedicine	en_GB
dc.okm.discipline	3111 Biolääketieteet	fi_FI
dc.okm.internationalcopublication	not an international co-publication
dc.okm.internationality	International publication
dc.okm.type	A1 ScientificArticle
dc.publisher	OXFORD UNIV PRESS
dc.publisher.country	United Kingdom	en_GB
dc.publisher.country	Britannia	fi_FI
dc.publisher.country-code	GB
dc.relation.doi	10.1093/bib/bbac286
dc.relation.ispartofjournal	Briefings in Bioinformatics
dc.source.identifier	https://www.utupub.fi/handle/10024/159647
dc.title	Benchmarking methods for detecting differential states between conditions from multi-subject single-cell RNA-seq data
dc.year.issued	2022

Files

Name: Junttila_etAl2022_Article_BenchmarkingMethods.pdf
Size: 1.67 MB
Format: Adobe Portable Document Format

Collections

Self-archived publications