da Silva Samoel R M, Perrone Gabriel C, Dinis João M, de Almeida Rita M C
Instituto de Física, Universidade Federal do Rio Grande do Sul, Av. Bento Gonçalves, 9500, 91501-970 Porto Alegre, RS, Brazil.
BMC Genomics. 2014 Dec 24;15(1):1181. doi: 10.1186/1471-2164-15-1181.
Transcriptogram profiling is a method to present and analyze transcription data on a genome-wide scale that reduces noise and facilitates biological interpretation. An ordered gene list is produced such that the probability that two genes are functionally associated decays exponentially with their distance on the list. This list follows a biological logic, evinced by the selective enrichment of successive intervals with Gene Ontology terms or KEGG pathways. Transcriptograms are expression profiles obtained by averaging gene expression over neighboring genes on this list. Transcriptograms enhance the reproducibility and precision of expression measurements for functionally correlated gene sets.
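The averaging step described above can be illustrated with a minimal sketch. It assumes the expression values are already arranged in the order of the gene list; the function name `transcriptogram`, the `radius` parameter, and the choice to truncate the window at the ends of the list are illustrative assumptions, not details taken from the paper.

```python
def transcriptogram(expression, radius):
    """Average expression over neighboring positions on an ordered gene list.

    expression: list of floats, assumed to be already sorted in the
                ordering-list order (hypothetical input format).
    radius:     number of neighbors on each side included in the window
                (hypothetical parameter name).
    """
    if radius < 0:
        raise ValueError("radius must be non-negative")
    n = len(expression)
    profile = []
    for i in range(n):
        # Assumption: windows are truncated at the ends of the list.
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = expression[lo:hi]
        profile.append(sum(window) / len(window))
    return profile


# Toy values for five consecutive genes on the ordered list.
ordered_expression = [1.0, 3.0, 2.0, 8.0, 6.0]
smoothed = transcriptogram(ordered_expression, radius=1)
```

Because neighbors on the list tend to be functionally associated, averaging over a window pools genes that are expected to vary together, which is how the smoothing can reduce noise without mixing unrelated signals. A larger `radius` smooths more strongly at the cost of resolution along the list.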
Here we present an ordering list for Homo sapiens and apply the transcriptogram profiling method to different datasets. We show that this method improves reproducibility across experiments and strengthens the signal. We applied the method to a diabetes study by Hwang and collaborators, which focused on expression differences between cybrids produced by the hybridization of mitochondria from diabetes mellitus donors with mitochondria-depleted osteosarcoma cell lines. The transcriptogram method revealed significant differential expression in gene sets linked to blood coagulation and wound healing pathways, and also in gene sets that do not correspond to any metabolic pathway or Gene Ontology term. These gene sets are connected to ECM-receptor interaction and secreted proteins.
The transcriptogram profiling method provided an automatic way to define sets of genes with correlated expression, reduce noise in genome-wide transcription profiles, and enhance measurement reproducibility and sensitivity. These advantages enabled biological interpretation and pointed to differentially expressed gene sets in diabetes mellitus that had not been previously defined.