
Clinic-genomic association mining for colorectal cancer using publicly available datasets.

Authors

Liu Fang, Feng Yaning, Li Zhenye, Pan Chao, Su Yuncong, Yang Rui, Song Liying, Duan Huilong, Deng Ning

Affiliations

Department of Biomedical Engineering, Key Laboratory for Biomedical Engineering of Ministry of Education, Zhejiang University, Hangzhou 310027, China.

General Hospital of Ningxia Medical University, Yinchuan 750004, China.

Publication information

Biomed Res Int. 2014;2014:170289. doi: 10.1155/2014/170289. Epub 2014 Jun 2.

DOI:10.1155/2014/170289
PMID:24987669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4060771/
Abstract

In recent years, a growing number of researchers have focused on establishing associations between clinical and genomic data. However, there is still a lack of research that mines clinic-genomic associations by comprehensively analysing the available gene expression data for a single disease. Colorectal cancer is a malignant tumour, and a number of genetic syndromes have been proven to be associated with it. This paper presents our research on mining clinic-genomic associations for colorectal cancer in a biomedical big data environment. The proposed method combines multiple technologies: extracting clinical concepts using the Unified Medical Language System (UMLS), extracting genes through literature mining, and mining clinic-genomic associations through statistical analysis. We applied this method to datasets extracted from both the Gene Expression Omnibus (GEO) and the Genetic Association Database (GAD). A total of 23,517 clinic-genomic associations between 139 clinical concepts and 7,914 genes were obtained, of which 3,474 associations between 31 clinical concepts and 1,689 genes were identified as highly reliable. Evaluation and interpretation were performed using UMLS, KEGG, and Gephi, and potential new discoveries were explored. The proposed method is effective in mining valuable knowledge from available biomedical big data and performs well in bridging clinical data with genomic data for colorectal cancer.
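The abstract names the final step only as "statistical analysis" and does not say which test was used. As a minimal sketch, assuming each dataset has already been annotated with a set of clinical concepts and a set of genes, one plausible way to score a concept-gene pair is a one-sided Fisher exact test on how often the two co-occur across datasets. The test choice, the function names (`fisher_right_tail`, `mine_associations`), the toy data, and the threshold `alpha` are all illustrative assumptions, not the authors' method.

```python
from math import comb

def fisher_right_tail(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    a: datasets with both concept and gene, b: concept only,
    c: gene only, d: neither. Returns P(co-occurrences >= a)
    under independence (hypergeometric tail).
    """
    n = a + b + c + d
    row1 = a + b          # datasets annotated with the concept
    col1 = a + c          # datasets mentioning the gene
    denom = comb(n, col1)
    p = 0.0
    for k in range(a, min(row1, col1) + 1):
        p += comb(row1, k) * comb(n - row1, col1 - k) / denom
    return p

def mine_associations(datasets, alpha=0.1):
    """Return (concept, gene, p) triples with p <= alpha, most significant first."""
    concepts = sorted(set().union(*(ds["concepts"] for ds in datasets)))
    genes = sorted(set().union(*(ds["genes"] for ds in datasets)))
    results = []
    for concept in concepts:
        for gene in genes:
            a = b = c = d = 0
            for ds in datasets:
                has_c = concept in ds["concepts"]
                has_g = gene in ds["genes"]
                if has_c and has_g:
                    a += 1
                elif has_c:
                    b += 1
                elif has_g:
                    c += 1
                else:
                    d += 1
            p = fisher_right_tail(a, b, c, d)
            if p <= alpha:
                results.append((concept, gene, p))
    return sorted(results, key=lambda t: t[2])

# Hypothetical toy input: placeholder concept and gene names, not real findings.
toy_datasets = [
    {"concepts": {"concept_x"}, "genes": {"GENE_A"}},
    {"concepts": {"concept_x"}, "genes": {"GENE_A"}},
    {"concepts": {"concept_x"}, "genes": {"GENE_A", "GENE_B"}},
    {"concepts": {"concept_y"}, "genes": {"GENE_B"}},
    {"concepts": {"concept_y"}, "genes": {"GENE_B"}},
    {"concepts": {"concept_y"}, "genes": {"GENE_B"}},
]

if __name__ == "__main__":
    for concept, gene, p in mine_associations(toy_datasets):
        print(f"{concept}\t{gene}\tp={p:.3f}")
```

In a real pipeline the concept sets would come from UMLS-based annotation of dataset descriptions and the gene sets from literature mining, and the p-values would need correction for multiple testing given the thousands of pairs involved.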

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f36/4060771/135f4ad66579/BMRI2014-170289.001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f36/4060771/4c929baddbd1/BMRI2014-170289.002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f36/4060771/360c9d6fca49/BMRI2014-170289.003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f36/4060771/552c58fd971c/BMRI2014-170289.004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f36/4060771/00be014a0bee/BMRI2014-170289.005.jpg
