
Identification of most influential co-occurring gene suites for gastrointestinal cancer using biomedical literature mining and graph-based influence maximization.

Authors

Wang Charles C N, Jin Jennifer, Chang Jan-Gowth, Hayakawa Masahiro, Kitazawa Atsushi, Tsai Jeffrey J P, Sheu Phillip C-Y

Affiliations

Department of Bioinformatics and Medical Engineering, Asia University, Taichung, Taiwan.

Center for Artificial Intelligence in Precision Medicine, Asia University, Taichung, Taiwan.

Publication information

BMC Med Inform Decis Mak. 2020 Sep 3;20(1):208. doi: 10.1186/s12911-020-01227-6.

DOI:10.1186/s12911-020-01227-6
PMID:32883271
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7469322/
Abstract

BACKGROUND

Gastrointestinal (GI) cancers, including colorectal, gastric, and pancreatic cancer, are among the most frequently diagnosed malignancies each year and represent a major public health problem worldwide.

METHODS

This paper reports an aided curation pipeline for identifying potentially influential genes in gastrointestinal cancer. The pipeline mines the biomedical literature and recognizes named entities with a Bi-LSTM-CNN-CRF model. The entities and their associations are used to construct a graph, from which an influence maximization algorithm computes the most influential sets of co-occurring genes.
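The graph-construction and influence-maximization steps can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the named-entity step has already produced one set of gene mentions per document, and it swaps in a standard greedy seed selection under an independent-cascade model with Monte Carlo estimation. The function names, the base activation probability `prob`, and the placeholder labels `G1` to `G5` are all hypothetical.

```python
import itertools
import random


def build_cooccurrence_graph(docs):
    """Return {gene: {neighbor: count}} from per-document gene mentions."""
    graph = {}
    for genes in docs:
        for a, b in itertools.combinations(sorted(set(genes)), 2):
            graph.setdefault(a, {})
            graph.setdefault(b, {})
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def simulate_spread(graph, seeds, prob, rng):
    """Run one independent-cascade simulation; return the number of active nodes."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbor, count in graph.get(node, {}).items():
                # Assumption: more co-occurrences make activation more likely.
                p_edge = 1.0 - (1.0 - prob) ** count
                if neighbor not in active and rng.random() < p_edge:
                    active.add(neighbor)
                    next_frontier.append(neighbor)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, prob=0.3, runs=200, seed=0):
    """Monte Carlo estimate of the expected spread of a seed set."""
    rng = random.Random(seed)
    total = sum(simulate_spread(graph, seeds, prob, rng) for _ in range(runs))
    return total / runs


def greedy_influence_max(graph, k, prob=0.3, runs=200):
    """Greedily pick k genes, each time adding the one with the best estimated spread."""
    chosen = []
    for _ in range(k):
        candidates = [n for n in sorted(graph) if n not in chosen]
        best = max(
            candidates,
            key=lambda n: expected_spread(graph, chosen + [n], prob, runs),
        )
        chosen.append(best)
    return chosen


# Toy input: placeholder gene labels, not data from the paper.
toy_docs = [["G1", "G2"], ["G2", "G3"], ["G1", "G2"], ["G4", "G5"]]
toy_graph = build_cooccurrence_graph(toy_docs)
print(greedy_influence_max(toy_graph, k=2, prob=1.0, runs=1))
```

With `prob=1.0` every edge always fires, so the spread of a seed set is simply the size of the connected components it touches, and the greedy step picks one gene from each of the two toy components. Lower probabilities make the estimate depend on the number of simulation runs.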

RESULTS

The most influential sets of co-occurring genes that we discovered include RARA - CRBP1, CASP3 - BCL2, BCL2 - CASP3 - CRBP1, RARA - CASP3 - CRBP1, FOXJ1 - RASSF3 - ESR1, FOXJ1 - RASSF1A - ESR1, and FOXJ1 - RASSF1A - TNFAIP8 - ESR1. Using TCGA data together with functional and pathway enrichment analysis, we show that the proposed approach works well in the context of gastrointestinal cancer.
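The abstract does not say which statistic the enrichment analysis uses. As an illustration only, a common choice for pathway over-representation is the one-sided hypergeometric test, sketched here under that assumption; the function name and the numbers in the usage line are hypothetical and are not results from the paper.

```python
from math import comb


def hypergeom_enrichment_p(population, pathway_size, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `pathway_size` belong to the pathway."""
    upper = min(pathway_size, selected)
    total = comb(population, selected)
    favorable = sum(
        comb(pathway_size, k) * comb(population - pathway_size, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favorable / total


# Toy numbers: 10 genes in total, 5 in the pathway, 2 selected, both in the pathway.
print(hypergeom_enrichment_p(10, 5, 2, 2))
```

A small p-value suggests that the selected gene set overlaps the pathway more than chance would predict. In practice the values would also be corrected for multiple testing across pathways.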

CONCLUSIONS

Our pipeline uses text mining to identify objects and relationships, constructs a graph from them, and applies graph-based influence maximization to discover the most influential co-occurring genes. It presents a viable direction for assisting knowledge discovery in clinical applications.

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f076/7469322/ea27f283ee7a/12911_2020_1227_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f076/7469322/906152ef31a0/12911_2020_1227_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f076/7469322/c1312a6459a3/12911_2020_1227_Fig3_HTML.jpg
