
Mining differential top-k co-expression patterns from time course comparative gene expression datasets.

Affiliation

Department of Computer Science and Information Engineering, National Cheng Kung University, No. 1, University Road, Tainan City 701, Taiwan R.O.C.

Publication information

BMC Bioinformatics. 2013 Jul 21;14:230. doi: 10.1186/1471-2105-14-230.

DOI:10.1186/1471-2105-14-230
PMID:23870110
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3751367/
Abstract

BACKGROUND

Frequent pattern mining applied to microarray datasets appears to be a promising strategy for identifying relationships between gene expression levels. Unfortunately, this analysis identifies too many itemsets (sets of co-expressed genes), because it considers neither the importance of each gene within the biological processes underlying a cellular response nor the temporal properties of matched treatment-control conditions in a microarray dataset.
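The itemset explosion described above can be illustrated with a minimal sketch of plain frequent itemset counting. This is not the authors' code: the function name `frequent_itemsets`, the `min_support` and `max_size` parameters, and the toy gene lists are all assumptions made for illustration, and the brute-force enumeration stands in for whatever mining algorithm a real pipeline would use.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=3):
    """Count every gene combination up to max_size and keep the frequent ones.

    Brute-force enumeration, for illustration only (hypothetical helper).
    """
    counts = Counter()
    for genes in transactions:
        ordered = sorted(genes)  # canonical order so each itemset has one key
        for size in range(1, max_size + 1):
            for itemset in combinations(ordered, size):
                counts[itemset] += 1
    threshold = min_support * len(transactions)
    return {itemset: n for itemset, n in counts.items() if n >= threshold}


# Hypothetical toy data: each set lists the genes called "up-regulated" in one sample.
samples = [
    {"TP53", "CDKN1A", "MDM2"},
    {"TP53", "CDKN1A", "MDM2", "BAX"},
    {"TP53", "CDKN1A", "BAX"},
    {"TP53", "MDM2"},
]

result = frequent_itemsets(samples, min_support=0.5)
print(len(result), "frequent itemsets from 4 genes")
```

On this toy input, 4 genes and 4 samples already produce 11 frequent itemsets at a 50% support threshold, which hints at why an unweighted, threshold-based search becomes unwieldy on thousands of genes.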

RESULTS

We propose a method termed TIIM (Top-k Impactful Itemsets Miner), which requires only a user-defined number k and returns the top k itemsets whose genes are most significantly differentially co-expressed between 2 conditions over a time course. To weight genes differently, a table of impact degrees was constructed, in which each gene's impact degree is based on the number of its neighboring genes in gene regulatory networks that are differentially expressed in the dataset. Finally, the resulting top-k impactful itemsets were manually evaluated against previous literature and analyzed by a Gene Ontology enrichment method.
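The two ideas in this paragraph, a per-gene impact degree and a top-k ranking, can be sketched as follows. The abstract does not give TIIM's formulas, so everything here is an assumption: the impact degree is taken as the plain count of differentially expressed neighbours, the itemset score is taken as the summed impact degree times the absolute support difference between conditions, and the names `impact_degrees` and `top_k_itemsets` are hypothetical.

```python
import heapq


def impact_degrees(network, diff_expressed):
    """Assumed impact degree: number of network neighbours that are differentially expressed."""
    return {
        gene: sum(1 for neighbour in neighbours if neighbour in diff_expressed)
        for gene, neighbours in network.items()
    }


def top_k_itemsets(candidates, impact, k):
    """Return the k candidates with the highest assumed score.

    Each candidate is (genes, support_treated, support_control).
    Assumed score: sum of member impact degrees * |support difference|.
    """
    def score(candidate):
        genes, support_treated, support_control = candidate
        weight = sum(impact.get(gene, 0) for gene in genes)
        return weight * abs(support_treated - support_control)

    return heapq.nlargest(k, candidates, key=score)


# Hypothetical regulatory network (adjacency sets) and differentially expressed genes.
network = {
    "TP53": {"CDKN1A", "MDM2", "BAX"},
    "CDKN1A": {"TP53"},
    "MDM2": {"TP53"},
    "BAX": {"TP53"},
    "GAPDH": set(),
}
diff_expressed = {"TP53", "CDKN1A", "BAX"}
impact = impact_degrees(network, diff_expressed)

# Hypothetical candidate itemsets with their support in each condition.
candidates = [
    (("CDKN1A", "TP53"), 0.75, 0.25),
    (("MDM2", "TP53"), 0.75, 0.50),
    (("BAX", "GAPDH"), 1.0, 0.0),
    (("GAPDH",), 0.5, 0.5),
]
top2 = top_k_itemsets(candidates, impact, k=2)
print([genes for genes, _, _ in top2])
```

Asking for a fixed k rather than a support threshold means the caller never has to guess a cutoff; the ranking alone decides which itemsets are reported.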

CONCLUSIONS

In this study, the proposed method was evaluated on 2 publicly available time course microarray datasets, each with 2 different experimental conditions. In both datasets, the method identified potential itemsets of co-expressed genes that were supported by the literature, and it showed higher accuracy than the 2 corresponding control methods: i) performing TIIM without considering the gene expression differentiation between the 2 experimental conditions or the impact degrees, and ii) performing TIIM with a constant impact degree for each gene. The proposed method also found several new gene regulations in these itemsets that could be useful to biologists and provide further insight into the mechanisms underpinning biological processes. The Java source code and other related materials used in this study are available at "http://websystem.csie.ncku.edu.tw/TIIM_Program.rar".

