Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm.

Affiliation

Knowledge Discovery Group, Institute for Information Technology, National Research Council Canada, 1200 Montréal Road, Ottawa, ON K1A 0R6, Canada.

Publication information

BMC Bioinformatics. 2012 Apr 4;13:54. doi: 10.1186/1471-2105-13-54.

DOI: 10.1186/1471-2105-13-54
PMID: 22475802
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3376030/
Abstract

BACKGROUND

Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space.
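To make the gene-sample-time layout concrete, here is a minimal sketch of how such a 3D dataset might be held in memory. The gene names, sample names, time points, and expression values are all made up for illustration, and the helper functions `shape` and `time_profile` are hypothetical names, not part of any published tool.

```python
# Illustrative GST (gene-sample-time) cube: expression[g][s][t].
# All labels and values below are invented for this sketch.
genes = ["geneA", "geneB"]
samples = ["control", "treated"]
times = [0, 6, 12]  # hypothetical sampling times, in hours

expression = [
    # control            treated
    [[1.0, 1.2, 1.1], [1.0, 2.5, 4.0]],  # geneA
    [[3.0, 2.9, 3.1], [3.0, 1.5, 0.7]],  # geneB
]


def shape(cube):
    """Return (number of genes, number of samples, number of time points)."""
    return (len(cube), len(cube[0]), len(cube[0][0]))


def time_profile(gene, sample):
    """Return the expression values of one gene in one sample across time."""
    return expression[genes.index(gene)][samples.index(sample)]


print(shape(expression))
print(time_profile("geneA", "treated"))
```

A short time series in this sense has only a handful of entries along the last axis, which is why methods based on the ordering of time points, rather than on fitted curves, are attractive for such data.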

RESULTS

We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster) for mining 3D short time-series data. OPTricluster identifies 3D clusters with coherent evolution in a given 3D dataset by combining a combinatorial approach on the sample dimension with the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profiles. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between the inner and outer cotyledons of Brassica napus during seed development, and whole seed development in Brassica napus. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples.
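The combination of an order-preserving pattern on the time axis with a combinatorial enumeration over samples can be sketched as follows. This is a simplified illustration of the general idea only, not the published OPTricluster implementation: the function names `rank_pattern` and `op_triclusters` are hypothetical, the data are invented, ties between time points are broken by time index, and no noise handling is included.

```python
from collections import defaultdict
from itertools import combinations


def rank_pattern(profile):
    """Return the time indices sorted by increasing expression.

    Two profiles share a pattern when their time points are ordered the
    same way. Ties are broken by time index (a simplification).
    """
    return tuple(sorted(range(len(profile)), key=lambda t: profile[t]))


def op_triclusters(data, samples):
    """Group genes by (sample subset, ordering pattern).

    data[gene][sample] is a list of expression values over time.
    A gene joins the cluster for a sample subset when its ordering
    pattern is identical in every sample of that subset.
    """
    clusters = defaultdict(list)
    for gene, by_sample in data.items():
        patterns = {s: rank_pattern(by_sample[s]) for s in samples}
        # Combinatorial step: try every non-empty subset of samples.
        for k in range(1, len(samples) + 1):
            for subset in combinations(samples, k):
                first = patterns[subset[0]]
                if all(patterns[s] == first for s in subset):
                    clusters[(subset, first)].append(gene)
    return dict(clusters)


# Invented toy data: three genes, two samples, three time points.
data = {
    "g1": {"A": [1.0, 2.0, 3.0], "B": [0.5, 1.5, 4.0]},
    "g2": {"A": [0.2, 0.9, 1.1], "B": [3.0, 2.0, 1.0]},
    "g3": {"A": [2.0, 5.0, 9.0], "B": [1.0, 2.0, 3.0]},
}
result = op_triclusters(data, ["A", "B"])
for (subset, pattern), members in sorted(result.items()):
    print(subset, pattern, members)
```

In this toy run, g1 and g3 rise over time in both samples and so land in a cluster spanning samples A and B, while g2 rises in A but falls in B and therefore appears only in single-sample clusters. Enumerating every subset grows exponentially with the number of samples, so a sketch like this is only practical when the sample dimension is small.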

CONCLUSIONS

Our analysis showed that OPTricluster generally outperforms other well-known clustering algorithms such as TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in 3D short time-series gene expression data.
