Mining differential top-k co-expression patterns from time course comparative gene expression datasets.

Affiliation

Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, University Road, Tainan City 701, Taiwan R.O.C.

Publication information

BMC Bioinformatics. 2013 Jul 21;14:230. doi: 10.1186/1471-2105-14-230.

DOI:10.1186/1471-2105-14-230

PMID:23870110

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3751367/

Abstract

BACKGROUND

Frequent pattern mining applied to microarray datasets appears to be a promising strategy for identifying relationships between gene expression levels. However, this approach identifies too many itemsets (sets of co-expressed genes), because it considers neither the importance of each gene within the biological processes underlying a cellular response nor the temporal properties of matched treatment-control conditions in a microarray dataset.

RESULTS

We propose a method termed TIIM (Top-k Impactful Itemsets Miner), which requires only a user-defined number k and explores the top k itemsets containing the most significantly differentially co-expressed genes between 2 conditions in a time course. To weight genes differently, a table of impact degrees was constructed, in which each gene's impact degree is based on the number of its neighboring genes in gene regulatory networks that are differentially expressed in the dataset. Finally, the resulting top-k impactful itemsets were manually evaluated against previous literature and analyzed with a Gene Ontology enrichment method.
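The impact-degree table described above can be sketched roughly as follows. This is a minimal illustration, not the authors' Java implementation: the function name `build_impact_table`, the choice to treat the regulatory network as undirected, the use of the raw neighbor count as the impact degree, and the toy edges are all assumptions, since the abstract says only that the table is "based on" the number of differentially expressed neighbors.

```python
from typing import Dict, Iterable, Set, Tuple


def build_impact_table(
    edges: Iterable[Tuple[str, str]],
    differential_genes: Set[str],
) -> Dict[str, int]:
    """For each gene, count its network neighbors that are differentially expressed."""
    neighbors: Dict[str, Set[str]] = {}
    for regulator, target in edges:
        # Assumption: edges are treated as undirected when counting neighbors.
        neighbors.setdefault(regulator, set()).add(target)
        neighbors.setdefault(target, set()).add(regulator)
    # Assumption: the impact degree is the raw count of differential neighbors.
    return {
        gene: len(adjacent & differential_genes)
        for gene, adjacent in neighbors.items()
    }


# Hypothetical toy network; gene names are labels only, not results from the paper.
edges = [("TP53", "CDKN1A"), ("TP53", "MDM2"), ("TP53", "BAX"), ("MDM2", "CDKN1A")]
differential_genes = {"TP53", "CDKN1A", "BAX"}

impact = build_impact_table(edges, differential_genes)
for gene, degree in sorted(impact.items()):
    print(gene, degree)
```

In this toy network, MDM2 receives a high impact degree even though it is not itself differentially expressed, because two of its neighbors are. This shows how such a table can weight a gene by its regulatory context rather than only by its own expression change.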

CONCLUSIONS

In this study, the proposed method was evaluated on 2 publicly available time course microarray datasets with 2 different experimental conditions. In both datasets, the method identified potential itemsets of co-expressed genes that were supported by the literature, and it showed higher accuracy than the 2 corresponding control methods: i) performing TIIM without considering the gene expression differentiation between the 2 experimental conditions or the impact degrees, and ii) performing TIIM with a constant impact degree for each gene. The proposed method also found several new gene regulations involved in these itemsets, which are useful for biologists and provide further insights into the mechanisms underpinning biological processes. The Java source code and other related materials used in this study are available at "http://websystem.csie.ncku.edu.tw/TIIM_Program.rar".
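To make the comparison with control method ii) concrete, the sketch below ranks candidate itemsets under gene-specific impact degrees and under a constant impact degree. It is an illustration under stated assumptions only: the score (absolute support difference between conditions multiplied by the mean impact degree), the brute-force enumeration, the function names, and the toy data are hypothetical, because the abstract does not give TIIM's scoring function or search strategy.

```python
import heapq
from itertools import combinations
from typing import Dict, FrozenSet, List, Optional, Set, Tuple

Transactions = List[Set[str]]


def support(itemset: FrozenSet[str], transactions: Transactions) -> float:
    """Fraction of transactions (e.g. time points) containing every gene in the itemset."""
    if not transactions:
        return 0.0
    return sum(itemset <= t for t in transactions) / len(transactions)


def top_k_itemsets(
    treatment: Transactions,
    control: Transactions,
    impact: Dict[str, float],
    k: int,
    max_size: int = 2,
    constant_impact: Optional[float] = None,
) -> List[Tuple[float, Tuple[str, ...]]]:
    """Return the k highest-scoring itemsets as (score, genes) pairs, best first."""
    genes = sorted(set().union(*treatment, *control))
    scored = []
    for size in range(1, max_size + 1):
        for combo in combinations(genes, size):
            itemset = frozenset(combo)
            # Differential co-expression: how much the support changes between conditions.
            diff = abs(support(itemset, treatment) - support(itemset, control))
            # Assumed weighting: mean impact degree, or a constant for the control run.
            weights = [
                constant_impact if constant_impact is not None else impact.get(g, 0)
                for g in combo
            ]
            scored.append((diff * sum(weights) / len(weights), combo))
    return heapq.nlargest(k, scored)


# Hypothetical data: each set lists the genes called "expressed" at one time point.
treatment = [{"A", "B"}, {"A", "B"}, {"A", "B", "C"}, {"C"}]
control = [{"C"}, {"A", "C"}, {"C"}, {"B", "C"}]
impact = {"A": 4, "B": 1, "C": 2}  # assumed impact degrees

weighted = top_k_itemsets(treatment, control, impact, k=2)
constant = top_k_itemsets(treatment, control, impact, k=1, constant_impact=1)
print("weighted:", weighted)
print("constant:", constant)
```

With these toy inputs, the gene-specific weights place the single high-impact gene A ahead of the pair (A, B), while the constant weight ranks the pair first. This is the kind of ranking difference that the comparison against the constant-impact control is meant to expose.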

Article figure (image): https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3cdf/3751367/3ee80eafea36/1471-2105-14-230-1.jpg
