Data- and expert-driven rule induction and filtering framework for functional interpretation and description of gene sets.

Authors

Gruca Aleksandra, Sikora Marek

Affiliation

Institute of Informatics, Silesian University of Technology, Akademicka 16, Gliwice, 44-100, Poland.

Publication

J Biomed Semantics. 2017 Jun 26;8(1):23. doi: 10.1186/s13326-017-0129-x.

DOI:10.1186/s13326-017-0129-x
PMID:28651634
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5483958/
Abstract

BACKGROUND

High-throughput methods in molecular biology have provided researchers with an abundance of experimental data that must be interpreted in order to understand the experimental results. Manual methods of functional gene/protein group interpretation are expensive and time-consuming; therefore, there is a need for new, efficient data mining methods and bioinformatics tools that can support the expert in the functional analysis of experimental results.

RESULTS

In this study, we propose a comprehensive framework for the induction of logical rules in the form of combinations of Gene Ontology (GO) terms for functional interpretation of gene sets. Within the framework, we present four approaches: a fully automated method of rule induction without filtering, a rule induction method with filtering, an expert-driven rule filtering method based on additive utility functions, and an expert-driven rule induction method based on so-called seed or expert terms, that is, GO terms of special interest that should be included in the description. These GO terms usually describe processes or pathways of particular interest that are related to the experiment being performed. During the rule induction and filtering processes, such seed terms are used as the base on which the description is built.
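To make the notion of a rule concrete, the following is a minimal Python sketch, not the authors' Matlab implementation. It assumes a rule premise is a conjunction of GO terms, that a rule covers a gene when every term in the premise annotates that gene, and that an additive utility is a weighted sum of quality criteria (here precision and coverage). The gene names, the toy annotation table, the `Rule` class, and the weights are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Toy annotation table (hypothetical): gene -> set of GO term IDs.
ANNOTATIONS = {
    "g1": {"GO:0006915", "GO:0006281"},
    "g2": {"GO:0006915", "GO:0006281", "GO:0007049"},
    "g3": {"GO:0006915"},
    "g4": {"GO:0007049"},
    "g5": {"GO:0006281", "GO:0007049"},
    "g6": {"GO:0006915", "GO:0007049"},
}
GENE_SET = {"g1", "g2", "g3"}  # the gene group to be described


@dataclass(frozen=True)
class Rule:
    terms: frozenset  # premise: conjunction of GO terms

    def covers(self, gene):
        # A gene is covered when all premise terms annotate it.
        return self.terms <= ANNOTATIONS[gene]


def evaluate(rule, gene_set):
    """Return (precision, coverage) of a rule with respect to gene_set."""
    covered = {g for g in ANNOTATIONS if rule.covers(g)}
    hits = covered & gene_set
    precision = len(hits) / len(covered) if covered else 0.0
    coverage = len(hits) / len(gene_set)
    return precision, coverage


def utility(rule, gene_set, weights=(0.5, 0.5)):
    """Additive utility: weighted sum of the two criteria (assumed weights)."""
    precision, coverage = evaluate(rule, gene_set)
    return weights[0] * precision + weights[1] * coverage


if __name__ == "__main__":
    candidates = [
        Rule(frozenset({"GO:0006915"})),
        Rule(frozenset({"GO:0006915", "GO:0006281"})),
    ]
    # Rank candidate rules by utility, best first.
    for rule in sorted(candidates, key=lambda r: utility(r, GENE_SET), reverse=True):
        print(sorted(rule.terms), evaluate(rule, GENE_SET))
```

In a sketch like this, an expert would express preferences by changing the weights; the criteria actually used by the framework are not stated in the abstract.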

CONCLUSION

We compare the descriptions obtained with different rule induction and filtering algorithms and show that a filtering step is required to reduce the number of rules in the output set so that they can be analyzed by a human expert. However, filtering may remove information from the output rule set that is potentially interesting to the expert. Therefore, we present two methods that involve interaction with the expert during the process of rule induction. Both are able to reduce the number of rules, but only with the method based on seed terms does each created rule include expert terms in combination with other terms. Further analysis of such combinations may provide new knowledge about biological processes and their combination with other pathways related to the genes described by the rules. A suite of Matlab scripts that provides the functionality of the rule induction and filtering framework presented in this study is available free of charge at: http://rulego.polsl.pl/framework .
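The property that every rule contains an expert term can be illustrated with a small Python sketch. It assumes, purely for illustration, that candidate premises are enumerated exhaustively by extending a seed term with other GO terms found in the gene set; this enumeration is a stand-in and not the induction algorithm of the paper. The function name `seed_rules`, the toy annotations, and the `max_len` parameter are hypothetical.

```python
from itertools import combinations

# Toy annotation table (hypothetical): gene -> set of GO term IDs.
ANNOTATIONS = {
    "g1": {"GO:0006915", "GO:0006281"},
    "g2": {"GO:0006915", "GO:0006281", "GO:0007049"},
    "g3": {"GO:0006915"},
}
GENE_SET = {"g1", "g2", "g3"}


def seed_rules(seed, gene_set, annotations, max_len=3):
    """Enumerate premises that combine `seed` with 1..max_len-1 other terms.

    A premise is kept only if at least one gene in gene_set is annotated
    with all of its terms.
    """
    others = sorted({t for g in gene_set for t in annotations[g]} - {seed})
    rules = []
    for k in range(1, max_len):
        for extra in combinations(others, k):
            premise = frozenset((seed, *extra))
            if any(premise <= annotations[g] for g in gene_set):
                rules.append(premise)
    return rules


if __name__ == "__main__":
    for premise in seed_rules("GO:0006281", GENE_SET, ANNOTATIONS):
        print(sorted(premise))
```

Because the seed is part of every premise by construction, no later filtering step can leave the expert with a rule set that omits the term of interest, which is the contrast with utility-based filtering drawn above.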

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df60/5483958/9e052795a5dc/13326_2017_129_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/df60/5483958/dd667a9d709d/13326_2017_129_Fig2_HTML.jpg
