Linder Daniel F, Rempala Grzegorz A
Department of Biostatistics, Jiann-Ping Hsu College of Public Health, Georgia Southern University, P.O. Box 8015, Statesboro, GA 30460.
Division of Biostatistics, Ohio State University, Cunz Hall, 1841 Neil Ave., Columbus, OH 43210; Mathematical Biosciences Institute, Ohio State University, Jennings Hall, 1735 Neil Ave., Columbus, OH 43210.
J Coupled Syst Multiscale Dyn. 2013 Dec;1(4):468-475. doi: 10.1166/jcsmd.2013.1032.
With modern molecular quantification methods such as high-throughput sequencing, biologists can perform multiple complex experiments and collect longitudinal data on RNA and DNA concentrations. Such data may then be used to infer cellular-level interactions between the molecular entities of interest. One method that formalizes such inference is the stoichiometric algebraic statistical model (SASM) of [2], which allows the analysis of so-called conic (or single-source) networks. Despite its intuitive appeal, the SASM has so far been studied only heuristically, on a few simple examples. The current paper provides a more formal mathematical treatment of the SASM, extending the original model to a wider class of reaction systems decomposable into multiple conic subnetworks. In particular, it is proved here that on such networks the SASM enjoys the so-called sparsistency property: asymptotically, as the number of observed network trajectories grows, it discards the false interactions by setting their reaction rates to zero. For illustration, we apply the extended SASM to in silico data from a generic decomposable network as well as to biological data from an experimental search for a possible transcription factor for heat shock protein 70 (Hsp70) in the zebrafish retina.