Mayburd Anatoly, Baranova Ancha
The Center of the Study of Chronic Metabolic Diseases, School of Systems Biology, College of Science, George Mason University, Fairfax, VA 22030, USA.
BMC Syst Biol. 2013 Nov 7;7:121. doi: 10.1186/1752-0509-7-121.
High-throughput profiling of human tissues typically yields gene lists that mix relevant molecular entities with multiple false positives, which obstructs the translation of such results into mechanistic hypotheses. From general probabilistic considerations, gene lists distilled down to their mechanistically relevant components can be far more useful for subsequent experimental design or data interpretation.
The input candidate gene lists were processed into different tiers of evidence consistency, established by enrichment analysis across subsets of the same experiments and across different experiments and platforms. The cut-offs were established empirically through ontological and semantic enrichment; the resulting shortened gene list was re-expanded with the Ingenuity Pathway Assistant tool. The resulting sub-networks provided the basis for generating mechanistic hypotheses that were partially validated by literature search. This approach differs from previous consistency-based studies in that the cut-off on the receiver operating characteristic (ROC) of the true-false separation process is optimized by flexible selection of the consistency-building procedure. The gene list distilled by this analytic technique, together with its network representation, was termed the Compact Disease Model (CDM). Here we present the CDM signature for the study of early-stage Alzheimer's disease. The integrated analysis of this gene signature allowed us to identify protein traffic vesicles as prominent players in the pathogenesis of Alzheimer's disease. Considering the distances and complexity of protein trafficking in neurons, it is plausible that spontaneous protein misfolding, together with a shortage of growth stimulation, results in neurodegeneration. Several potentially overlapping scenarios of early-stage Alzheimer's pathogenesis are discussed, with an emphasis on the protective effects of the AT-1-mediated antihypertensive response on cytoskeleton remodeling, along with neuronal activation of oncogenes, luteinizing hormone signaling and insulin-related growth regulation, forming a pleiotropic model of the early stages of the disease. Alignment with emerging literature confirmed many predictions derived from the early-stage Alzheimer's disease CDM.
Compact Disease Model generation, a flexible approach to high-throughput data analysis, allows extraction of meaningful, mechanism-centered gene sets whose results can be translated immediately into testable hypotheses.
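The consistency tiers described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that "consistency" is simply the number of independent candidate lists in which a gene appears, and it replaces the empirically established, enrichment-based cut-off with a fixed hypothetical parameter `min_support`. The function names and the placeholder gene identifiers (`GENE_A`, and so on) are likewise illustrative assumptions.

```python
from collections import Counter


def consistency_tiers(gene_lists):
    """Map each gene to the number of input lists that contain it.

    Assumption: each input list comes from an independent experiment,
    platform, or experiment subset, so the count acts as a crude
    consistency tier.
    """
    counts = Counter()
    for genes in gene_lists:
        counts.update(set(genes))  # count a gene at most once per list
    return dict(counts)


def distill(gene_lists, min_support):
    """Keep genes found in at least `min_support` lists (hypothetical cut-off)."""
    tiers = consistency_tiers(gene_lists)
    return sorted(gene for gene, n in tiers.items() if n >= min_support)


# Placeholder identifiers, not real data from the study.
candidate_lists = [
    ["GENE_A", "GENE_B", "GENE_C"],
    ["GENE_B", "GENE_C", "GENE_D"],
    ["GENE_C", "GENE_E", "GENE_C"],
]

tiers = consistency_tiers(candidate_lists)
shortlist = distill(candidate_lists, min_support=2)
print(tiers)
print(shortlist)
```

In the workflow described in the abstract, the threshold would not be fixed in advance but tuned against ontological and semantic enrichment, and the shortened list would then be re-expanded into a network; neither step is modeled here.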