Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining

dc.contributor.author: Sanna M. Kreula
dc.contributor.author: Suwisa Kaewphan
dc.contributor.author: Filip Ginter
dc.contributor.author: Patrik R. Jones
dc.contributor.organization: fi=kieli- ja puheteknologia|en=Language and Speech Technology|
dc.contributor.organization: fi=molekulaarinen kasvibiologia|en=Molecular Plant Biology|
dc.contributor.organization: fi=tietojenkäsittelytiede|en=Computer Science|
dc.contributor.organization-code: 1.2.246.10.2458963.20.47465613983
dc.contributor.organization-code: 1.2.246.10.2458963.20.50535969575
dc.contributor.organization-code: 2606803
dc.converis.publication-id: 32074991
dc.converis.url: https://research.utu.fi/converis/portal/Publication/32074991
dc.date.accessioned: 2022-10-28T14:14:38Z
dc.date.available: 2022-10-28T14:14:38Z
dc.description.abstract: The increasing move towards open access full-text scientific literature enhances our ability to utilize advanced text-mining methods to construct information-rich networks that no human will be able to grasp simply from 'reading the literature'. The utility of text-mining for well-studied species is obvious though the utility for less studied species, or those with no prior track-record at all, is not clear. Here we present a concept for how advanced text-mining can be used to create information-rich networks even for less well studied species and apply it to generate an open-access gene-gene association network resource for Synechocystis sp. PCC 6803, a representative model organism for cyanobacteria and first case-study for the methodology. By merging the text-mining network with networks generated from species-specific experimental data, network integration was used to enhance the accuracy of predicting novel interactions that are biologically relevant. A rule-based algorithm (filter) was constructed in order to automate the search for novel candidate genes with a high degree of likely association to known target genes by (1) ignoring established relationships from the existing literature, as they are already 'known', and (2) demanding multiple independent evidences for every novel and potentially relevant relationship. Using selected case studies, we demonstrate the utility of the network resource and filter to (i) discover novel candidate associations between different genes or proteins in the network, and (ii) rapidly evaluate the potential role of any one particular gene or protein. The full network is provided as an open-source resource.
dc.identifier.eissn: 2167-8359
dc.identifier.jour-issn: 2167-8359
dc.identifier.olddbid: 187130
dc.identifier.oldhandle: 10024/170224
dc.identifier.uri: https://www.utupub.fi/handle/11111/42549
dc.identifier.url: https://peerj.com/articles/4806/
dc.identifier.urn: URN:NBN:fi-fe2021042719351
dc.language.iso: en
dc.okm.affiliatedauthor: Kreula, Sanna
dc.okm.affiliatedauthor: Kaewphan, Suwisa
dc.okm.affiliatedauthor: Ginter, Filip
dc.okm.discipline: 113 Computer and information sciences [en_GB]
dc.okm.discipline: 1182 Biochemistry, cell and molecular biology [en_GB]
dc.okm.discipline: 1183 Plant biology, microbiology, virology [en_GB]
dc.okm.discipline: 113 Tietojenkäsittely ja informaatiotieteet [fi_FI]
dc.okm.discipline: 1182 Biokemia, solu- ja molekyylibiologia [fi_FI]
dc.okm.discipline: 1183 Kasvibiologia, mikrobiologia, virologia [fi_FI]
dc.okm.internationalcopublication: international co-publication
dc.okm.internationality: International publication
dc.okm.type: A1 ScientificArticle
dc.publisher.country: United Kingdom [en_GB]
dc.publisher.country: Britannia [fi_FI]
dc.publisher.country-code: GB
dc.relation.articlenumber: 29844966
dc.relation.doi: 10.7717/peerj.4806
dc.relation.ispartofjournal: PeerJ
dc.relation.issue: e4806
dc.relation.volume: 6
dc.source.identifier: https://www.utupub.fi/handle/10024/170224
dc.title: Finding novel relationships with integrated gene-gene association network analysis of Synechocystis sp. PCC 6803 using species-independent text-mining
dc.year.issued: 2018

Files

Name: 4806.pdf
Size: 4.24 MB
Format: Adobe Portable Document Format
Description: Publisher's version