Integrated Bioresource Information Division, RIKEN BioResource Research Center, 3-1-1 Koyadai, Tsukuba, Ibaraki, 305-0071, Japan.
Sci Rep. 2020 Mar 3;10(1):3957. doi: 10.1038/s41598-020-60891-w.
To date, reliable relationships between mammalian phenotypes, based on diagnostic test measurements, have not been reported on a large scale. The purpose of this study was to present a large dataset of mouse phenotype-phenotype relationships as a reference resource, together with a detailed evaluation of that resource. We used bias-minimized comprehensive mouse phenotype data and applied association rule mining to a dataset consisting only of binary data (each phenotype scored as normal or abnormal) to determine relationships among phenotypes. We present 3,686 evidence-based significant associations, comprising 345 phenotypes covering 60 biological systems (functions), and evaluate their characteristics in detail. To evaluate the relationships, we defined a set of phenotype-phenotype association pairs (PPAPs) as a module of phenotypic expression for each of the 345 phenotypes. By analyzing each PPAP, we identified the phenotype sub-networks consisting of the largest numbers of phenotypes and distinct biological systems. Furthermore, using hierarchical clustering based on phenotype similarities among the 345 PPAPs, we identified seven community types within a putative phenome-wide association network. Moreover, to promote use of these data, we developed and published web-application tools. These mouse phenome-wide phenotype-phenotype association data reveal general principles of relationships among mammalian phenotypes and provide a reference resource for biomedical analyses.
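As an illustration of how association rule mining can be applied to binary (normal or abnormal) phenotype data, the following is a minimal sketch and not the pipeline used in the study. The phenotype labels, the toy records, the function name `pairwise_rules`, and the support, confidence, and lift thresholds are all hypothetical; the abstract does not state which measures or cut-offs were used. Each record is the set of phenotypes scored abnormal in one mutant line, and a rule A -> B is kept when the pair is frequent enough, B is abnormal in most records where A is abnormal, and the co-occurrence exceeds what independence would predict.

```python
from itertools import permutations


def pairwise_rules(records, min_support=0.2, min_confidence=0.6, min_lift=1.0):
    """Mine pairwise rules A -> B from binary phenotype calls.

    records: list of sets; each set holds the phenotypes scored abnormal
             in one mutant line (hypothetical input layout).
    Returns a list of (A, B, support, confidence, lift) tuples.
    """
    n = len(records)
    if n == 0:
        return []
    phenotypes = sorted(set().union(*records))
    # Number of records in which each phenotype is abnormal.
    count = {p: sum(1 for r in records if p in r) for p in phenotypes}

    rules = []
    for a, b in permutations(phenotypes, 2):
        both = sum(1 for r in records if a in r and b in r)
        support = both / n                 # P(A and B)
        if support < min_support:
            continue
        confidence = both / count[a]       # P(B | A)
        lift = confidence / (count[b] / n) # P(B | A) / P(B)
        if confidence >= min_confidence and lift > min_lift:
            rules.append((a, b, support, confidence, lift))
    return rules


# Toy data (invented for illustration only).
records = [
    {"low_body_weight", "low_bone_density"},
    {"low_body_weight", "low_bone_density"},
    {"low_body_weight", "low_bone_density", "high_glucose"},
    {"high_glucose"},
    {"low_body_weight"},
    set(),
]

rules = pairwise_rules(records)
for a, b, support, confidence, lift in rules:
    print(f"{a} -> {b}: support={support:.2f}, "
          f"confidence={confidence:.2f}, lift={lift:.2f}")
```

On this toy input only the two directions of the body weight and bone density pair pass the thresholds. Rules are directional, so A -> B and B -> A can differ in confidence even though they share the same support and lift.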