CamurWeb: a classification software and a large knowledge base for gene expression data of cancer.

Affiliations

Department of Engineering, Uninettuno International University, Corso Vittorio Emanuele II 39, Rome, 00186, Italy.

Institute of Systems Analysis and Computer Science "A. Ruberti", National Research Council, Via dei Taurini 19, Rome, 00185, Italy.

Publication information

BMC Bioinformatics. 2018 Oct 15;19(Suppl 10):354. doi: 10.1186/s12859-018-2299-7.

DOI:10.1186/s12859-018-2299-7
PMID:30367574
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6191971/
Abstract

BACKGROUND

The rapid growth of Next Generation Sequencing data demands new knowledge extraction methods. In particular, RNA sequencing, an experimental technique for measuring gene expression, stands out for case-control studies on cancer. Such studies can be addressed with supervised machine learning techniques that extract human-interpretable models composed of genes and their relation to the investigated disease. State-of-the-art rule-based classifiers are designed to extract a single classification model, possibly composed of a few relevant genes. Conversely, we aim to create a large knowledge base composed of many rule-based models, and thus determine which genes could potentially be involved in the analyzed tumor. This comprehensive, open-access knowledge base is required to disseminate novel insights about cancer.

RESULTS

We propose CamurWeb, a new method and web-based software that extracts multiple equivalent classification models in the form of logic formulas ("if then" rules) and creates a knowledge base of these rules that can be queried and analyzed. The method is based on an iterative classification procedure and an adaptive feature elimination technique that enables the computation of many rule-based models related to the cancer under study. Additionally, CamurWeb includes a user-friendly interface for running the software, querying the results, and managing the performed experiments. Users can create a profile, upload their gene expression data, run the classification analyses, and interpret the results with predefined queries. To validate the software, we apply it to all publicly available RNA sequencing datasets from The Cancer Genome Atlas database, obtaining a large open-access knowledge base about cancer. CamurWeb is available at http://bioinformatics.iasi.cnr.it/camurweb.
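The idea of iterating a classifier while eliminating the features it has already used can be illustrated with a minimal sketch. This is not the CamurWeb implementation: a single-gene threshold rule stands in for the rule-based classifier, the elimination step simply drops the gene used by the last rule, and the gene names, expression values, and the `min_accuracy` parameter are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Sample = Dict[str, float]
Rule = Tuple[float, str, float, str]  # (accuracy, gene, threshold, label if above)


def best_stump(samples: List[Sample], labels: List[str],
               features: Set[str]) -> Optional[Rule]:
    """Find the most accurate rule of the form 'if gene > threshold then label'."""
    best: Optional[Rule] = None
    for gene in sorted(features):
        values = sorted({s[gene] for s in samples})
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            for above in ("tumor", "normal"):
                below = "normal" if above == "tumor" else "tumor"
                correct = sum(
                    (above if s[gene] > thr else below) == y
                    for s, y in zip(samples, labels)
                )
                acc = correct / len(samples)
                if best is None or acc > best[0]:
                    best = (acc, gene, thr, above)
    return best


def extract_rules(samples: List[Sample], labels: List[str],
                  min_accuracy: float = 0.9) -> List[Rule]:
    """Repeat: learn a rule, store it, remove its gene, until accuracy drops."""
    features = set(samples[0])
    rules: List[Rule] = []
    while features:
        stump = best_stump(samples, labels, features)
        if stump is None or stump[0] < min_accuracy:
            break
        rules.append(stump)
        features.discard(stump[1])  # feature elimination step
    return rules


# Toy expression matrix (hypothetical values).
samples = [
    {"GENE_A": 9.1, "GENE_B": 2.0, "GENE_C": 5.0},
    {"GENE_A": 8.7, "GENE_B": 2.4, "GENE_C": 4.1},
    {"GENE_A": 1.2, "GENE_B": 7.9, "GENE_C": 4.8},
    {"GENE_A": 0.8, "GENE_B": 8.3, "GENE_C": 4.3},
]
labels = ["tumor", "tumor", "normal", "normal"]

rules = extract_rules(samples, labels)
for acc, gene, thr, above in rules:
    print(f"if {gene} > {thr:.2f} then {above} (accuracy {acc:.2f})")
```

On this toy data the loop yields two alternative rules, one on `GENE_A` and one on `GENE_B`, and stops when no remaining gene reaches the accuracy threshold. Each rule separates the classes on its own, which is the sense in which the models are "equivalent".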

CONCLUSIONS

The experiments prove the validity of CamurWeb, yielding many classification models and thus several genes that are associated with 21 different cancer types. Finally, the comprehensive knowledge base about cancer and the software tool are released online; interested researchers have free access to them for further studies and for designing biological experiments in cancer research.
