Integrating Large-Scale Text Mining and Co-Expression Networks: Targeting NADP(H) Metabolism in E. coli with Event Extraction
Patrik R Jones; Yves Van de Peer; Sanna Kreula; Sofie Van Landeghem; Filip Ginter; Suwisa Kaewphan
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2021042715145
Abstract
We present an application of EVEX, a literature-scale event extraction resource, in the concrete biological use case of NADP(H)
metabolism regulation in Escherichia coli. We make extensive use of the EVEX event generalization based on gene family definitions
in Ensembl Genomes to extract cross-species candidate regulators. We manually evaluate the resulting network to retain only
correct events and to facilitate its integration with microarray-based co-expression data. When analysing the combined network obtained
from text mining and co-expression, we identify 41 candidate genes involved in triangular patterns that span both subnetworks. Several
of these candidates are of particular interest, and we discuss their biological relevance further. This study is the first to present a
real-world evaluation of the EVEX resource in particular and literature-scale application of the systems emerging from the BioNLP
Shared Task series in general. We summarize the lessons learned from this use case in order to focus future development of EVEX and
similar literature-scale resources.