Department of Statistics, University of Missouri, Columbia, Missouri, USA.
Department of Nutrition, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA.
Stat Med. 2024 Dec 10;43(28):5446-5460. doi: 10.1002/sim.10261. Epub 2024 Oct 24.
Clusters of similar or dissimilar objects are encountered in many fields. Frequently used approaches treat each cluster's central object as latent. Yet, often objects of one or more types cluster around objects of another type. Such arrangements are common in biomedical images of cells, in which nearby cell types likely interact. Quantifying spatial relationships may elucidate biological mechanisms. Parent-offspring statistical frameworks can be usefully applied even when central objects ("parents") differ from peripheral ones ("offspring"). We propose the novel multivariate cluster point process (MCPP) to quantify multi-object (e.g., multi-cellular) arrangements. Unlike commonly used approaches, the MCPP exploits locations of the central parent object in clusters. It accounts for possibly multilayered, multivariate clustering. The model formulation requires specification of which object types function as cluster centers and which reside peripherally. If such information is unknown, the relative roles of object types may be explored by comparing fit of different models via the deviance information criterion (DIC). In simulated data, we compared a series of models' DIC; the MCPP correctly identified simulated relationships. It also produced more accurate and precise parameter estimates than the classical univariate Neyman-Scott process model. We also used the MCPP to quantify proposed configurations and explore new ones in human dental plaque biofilm image data. MCPP models quantified simultaneous clustering of Streptococcus and Porphyromonas around Corynebacterium and of Pasteurellaceae around Streptococcus and successfully captured hypothesized structures for all taxa. Further exploration suggested the presence of clustering between Fusobacterium and Leptotrichia, a previously unreported relationship.
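To make the parent-offspring idea concrete, the following is a minimal simulation sketch, not the authors' MCPP implementation. It assumes a single observed parent type placed uniformly on the unit square, a Poisson number of offspring per parent, and isotropic Gaussian displacement of offspring around their parent (a Thomas-type cluster). The function name, parameter names, and default values are hypothetical and chosen only for illustration.

```python
import numpy as np


def simulate_parent_offspring(n_parents=30, mean_offspring=8.0, sigma=0.02, seed=0):
    """Simulate offspring of one type clustered around observed parents of another.

    Assumptions (illustrative only, not taken from the paper):
    - parents are uniformly distributed on the unit square,
    - each parent has a Poisson(mean_offspring) number of offspring,
    - offspring are displaced from their parent by isotropic Gaussian noise
      with standard deviation sigma in each coordinate,
    - no edge correction: offspring may fall slightly outside the window.
    """
    rng = np.random.default_rng(seed)
    parents = rng.uniform(0.0, 1.0, size=(n_parents, 2))
    counts = rng.poisson(mean_offspring, size=n_parents)
    # Index of the parent that generated each offspring point.
    parent_idx = np.repeat(np.arange(n_parents), counts)
    offsets = rng.normal(0.0, sigma, size=(parent_idx.size, 2))
    offspring = parents[parent_idx] + offsets
    return parents, offspring, parent_idx


parents, offspring, parent_idx = simulate_parent_offspring()

# Because parent locations are observed rather than latent, simple summaries
# of cluster spread and size can be computed directly from the data.
displacements = offspring - parents[parent_idx]
sigma_hat = np.sqrt(np.mean(displacements ** 2))      # per-coordinate spread
mean_hat = offspring.shape[0] / parents.shape[0]      # offspring per parent
```

In this sketch the true parent of each offspring is known by construction. In real image data that assignment is not observed, which is one reason a full model-based fit is needed rather than these direct summaries.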