Simcluster: clustering enumeration gene expression data on the simplex space.

Authors

Vêncio Ricardo Z N, Varuzza Leonardo, de B Pereira Carlos A, Brentani Helena, Shmulevich Ilya

Affiliation

Institute for Systems Biology, 1441 North 34th street, Seattle, WA 98103-8904, USA.

Publication information

BMC Bioinformatics. 2007 Jul 11;8:246. doi: 10.1186/1471-2105-8-246.

DOI:10.1186/1471-2105-8-246
PMID:17625017
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2147035/
Abstract

BACKGROUND

Transcript enumeration methods such as SAGE, MPSS, and sequencing-by-synthesis EST "digital northern" are important high-throughput techniques for digital gene expression measurement. Like other counting or voting processes, these measurements constitute compositional data, which exhibit properties particular to the simplex space, where the sum of the components is constrained. These properties are absent from regular Euclidean spaces, on which hybridization-based microarray data are often modeled. Therefore, pattern recognition methods commonly used for microarray data analysis may be non-informative for data generated by transcript enumeration techniques, since they ignore certain fundamental properties of this space.

RESULTS

Here we present a software tool, Simcluster, designed to perform clustering analysis for data on the simplex space. We present Simcluster as a stand-alone command-line C package and as a user-friendly on-line tool. Both versions are available at: http://xerad.systemsbiology.net/simcluster.

CONCLUSION

Simcluster is designed in accordance with a well-established mathematical framework for compositional data analysis, which provides principled procedures for dealing with the simplex space, and is thus applicable in a number of contexts, including enumeration-based gene expression data.
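
To make the simplex constraint concrete, the following is a minimal Python sketch of two standard tools from compositional data analysis: closure (rescaling counts to proportions) and the Aitchison distance computed through the centered log-ratio transform. It is an illustration under stated assumptions, not Simcluster's actual code: the abstract does not specify which distance or transform the tool uses, the function names are hypothetical, and the tag counts are invented. The sketch also assumes strictly positive counts, so zero counts would need separate handling.

```python
import math

def closure(counts):
    """Rescale positive counts to proportions that sum to 1 (a point on the simplex)."""
    total = sum(counts)
    return [c / total for c in counts]

def clr(composition):
    """Centered log-ratio transform: log of each part minus the mean log of all parts."""
    logs = [math.log(p) for p in composition]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr-transformed compositions of x and y."""
    return math.dist(clr(closure(x)), clr(closure(y)))

def euclidean_distance(x, y):
    """Ordinary Euclidean distance on the raw counts, for comparison."""
    return math.dist(x, y)

# Invented tag counts for three genes in three libraries (illustrative only).
lib_a = [10, 30, 60]
lib_b = [100, 300, 600]  # same proportions as lib_a at 10x the sequencing depth
lib_c = [30, 30, 40]     # a genuinely different expression profile

print(aitchison_distance(lib_a, lib_b))  # depth change alone: essentially zero
print(euclidean_distance(lib_a, lib_b))  # raw counts: large, driven by depth
print(aitchison_distance(lib_a, lib_c))  # different profile: clearly positive
```

Libraries `lib_a` and `lib_b` differ only in total tag count, so a distance defined on the simplex treats them as the same profile, whereas Euclidean distance on raw counts separates them. This is the kind of property the abstract says Euclidean methods ignore.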

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/2a1413cf96dc/1471-2105-8-246-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/042f2b4ab3c3/1471-2105-8-246-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/e93cb07e1755/1471-2105-8-246-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/6f0134b1d369/1471-2105-8-246-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/aeaa2513481a/1471-2105-8-246-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b552/2147035/c67f64ab8a86/1471-2105-8-246-6.jpg

Similar articles

1
Simcluster: clustering enumeration gene expression data on the simplex space.
BMC Bioinformatics. 2007 Jul 11;8:246. doi: 10.1186/1471-2105-8-246.
2
TimeClust: a clustering tool for gene expression time series.
Bioinformatics. 2008 Feb 1;24(3):430-2. doi: 10.1093/bioinformatics/btm605. Epub 2007 Dec 6.
3
Automated Microarray Image Analysis Toolbox for MATLAB.
Bioinformatics. 2005 Sep 1;21(17):3578-9. doi: 10.1093/bioinformatics/bti576. Epub 2005 Jul 26.
4
Mosclust: a software library for discovering significant structures in bio-molecular data.
Bioinformatics. 2007 Feb 1;23(3):387-9. doi: 10.1093/bioinformatics/btl600. Epub 2006 Nov 24.
5
Maximum significance clustering of oligonucleotide microarrays.
Bioinformatics. 2006 Feb 1;22(3):326-31. doi: 10.1093/bioinformatics/bti788. Epub 2005 Nov 22.
6
Inferential clustering approach for microarray experiments with replicated measurements.
IEEE/ACM Trans Comput Biol Bioinform. 2009 Oct-Dec;6(4):594-604. doi: 10.1109/TCBB.2008.106.
7
SScore: an R package for detecting differential gene expression without gene expression summaries.
Bioinformatics. 2006 May 15;22(10):1272-4. doi: 10.1093/bioinformatics/btl108. Epub 2006 Mar 30.
8
How does gene expression clustering work?
Nat Biotechnol. 2005 Dec;23(12):1499-501. doi: 10.1038/nbt1205-1499.
9
Unsupervised pattern recognition: an introduction to the whys and wherefores of clustering microarray data.
Brief Bioinform. 2005 Dec;6(4):331-43. doi: 10.1093/bib/6.4.331.
10
cluML: A markup language for clustering and cluster validity assessment of microarray data.
Appl Bioinformatics. 2005;4(3):211-3. doi: 10.2165/00822942-200504030-00006.

Cited by

1
Post-transcriptional Mechanisms Contribute Little to Phenotypic Variation in Snake Venoms.
G3 (Bethesda). 2015 Sep 9;5(11):2375-82. doi: 10.1534/g3.115.020578.
2
Proportionality: a valid alternative to correlation for relative data.
PLoS Comput Biol. 2015 Mar 16;11(3):e1004075. doi: 10.1371/journal.pcbi.1004075. eCollection 2015 Mar.
3
The genesis of an exceptionally lethal venom in the timber rattlesnake (Crotalus horridus) revealed through comparative venom-gland transcriptomics.
BMC Genomics. 2013 Jun 12;14:394. doi: 10.1186/1471-2164-14-394.
4
An atlas of bovine gene expression reveals novel distinctive tissue characteristics and evidence for improving genome annotation.
Genome Biol. 2010;11(10):R102. doi: 10.1186/gb-2010-11-10-r102. Epub 2010 Oct 20.
5
MediPlEx - a tool to combine in silico & experimental gene expression profiles of the model legume Medicago truncatula.
BMC Res Notes. 2010 Oct 19;3:262. doi: 10.1186/1756-0500-3-262.
6
A score system for quality evaluation of RNA sequence tags: an improvement for gene expression profiling.
BMC Bioinformatics. 2009 Jun 6;10:170. doi: 10.1186/1471-2105-10-170.
7
Clustering-based approaches to SAGE data mining.
BioData Min. 2008 Jul 17;1(1):5. doi: 10.1186/1756-0381-1-5.

References

1
Metric for measuring the effectiveness of clustering of DNA microarray expression.
BMC Bioinformatics. 2006 Sep 6;7 Suppl 2(Suppl 2):S5. doi: 10.1186/1471-2105-7-S2-S5.
2
Analysis of the prostate cancer cell line LNCaP transcriptome using a sequencing-by-synthesis approach.
BMC Genomics. 2006 Sep 29;7:246. doi: 10.1186/1471-2164-7-246.
3
Methods for evaluating clustering algorithms for gene expression data using a reference set of functional classes.
BMC Bioinformatics. 2006 Aug 31;7:397. doi: 10.1186/1471-2105-7-397.
4
Systems biology approaches identify ATF3 as a negative regulator of Toll-like receptor 4.
Nature. 2006 May 11;441(7090):173-8. doi: 10.1038/nature04768.
5
Modeling Sage data with a truncated gamma-Poisson model.
BMC Bioinformatics. 2006 Mar 20;7:157. doi: 10.1186/1471-2105-7-157.
6
Gene sequencing. The race for the $1000 genome.
Science. 2006 Mar 17;311(5767):1544-6. doi: 10.1126/science.311.5767.1544.
7
Genome sequencing in microfabricated high-density picolitre reactors.
Nature. 2005 Sep 15;437(7057):376-80. doi: 10.1038/nature03959. Epub 2005 Jul 31.
8
Four-color DNA sequencing by synthesis on a chip using photocleavable fluorescent nucleotides.
Proc Natl Acad Sci U S A. 2005 Apr 26;102(17):5926-31. doi: 10.1073/pnas.0501965102. Epub 2005 Apr 13.
9
Statistical analysis of MPSS measurements: application to the study of LPS-activated macrophage gene expression.
Proc Natl Acad Sci U S A. 2005 Feb 1;102(5):1402-7. doi: 10.1073/pnas.0406555102. Epub 2005 Jan 24.
10
An integrated tool for microarray data clustering and cluster validity assessment.
Bioinformatics. 2005 Feb 15;21(4):451-5. doi: 10.1093/bioinformatics/bti190. Epub 2004 Dec 17.