Keikhosravi Adib, Guin Krishnendu, Pegoraro Gianluca, Misteli Tom
High Throughput Imaging Facility (HiTIF), National Cancer Institute, NIH, Bethesda, MD 20892.
Cell Biology of Genomes Group, National Cancer Institute, NIH, Bethesda, MD 20892.
bioRxiv. 2025 Jan 24:2025.01.22.634320. doi: 10.1101/2025.01.22.634320.
A prominent feature of eukaryotic chromosomes is the centromere, a specialized region of repetitive DNA required for faithful chromosome segregation during cell division. In interphase cells, centromeres are non-randomly positioned in the three-dimensional space of the nucleus in a cell-type-specific manner. The functional relevance of this observation and the cellular mechanisms underlying it are unknown, and quantitative methods to measure the distribution patterns of centromeres in 3D space are needed. Here we have developed an analytical framework that combines robust clustering metrics and spatial modeling for the quantitative analysis of centromere distributions at the single-cell level. To identify a robust quantitative measure of centromere clustering, we benchmarked six metrics for their ability to sensitively detect changes in centromere distribution patterns in high-throughput imaging data of human cells, both under normal conditions and upon experimental perturbation of centromere distribution. We find that Ripley's K Score has the highest accuracy with minimal sensitivity to variations in centromere number, making it the most suitable of the six metrics for measuring centromere distributions. As a complementary approach, we also developed and validated spatial models to replicate centromere distribution patterns, and we show that a radially shifted Gaussian distribution best represents the centromere patterns seen in human cells. Our approach creates tools for the quantitative characterization of spatial centromere distributions, with applications both in targeted studies of centromere organization and in unbiased screening approaches.
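To make the clustering metric concrete, the following is a minimal sketch of the textbook Ripley's K estimator for 3D point coordinates. It is not the authors' implementation: the abstract does not define how the "Ripley's K Score" is derived from K(r), so the function name `ripley_k_3d`, its parameters, and the omission of edge correction are all assumptions made for illustration.

```python
import numpy as np


def ripley_k_3d(points, radii, volume):
    """Naive Ripley's K estimate for 3D points (hypothetical helper).

    Assumptions, none taken from the paper: no edge correction is applied,
    `volume` is the volume of the observation region (e.g. the nucleus),
    and `points` holds one (x, y, z) row per centromere.
    """
    pts = np.asarray(points, dtype=float)
    radii = np.asarray(radii, dtype=float)
    n = len(pts)
    if n < 2:
        raise ValueError("at least two points are required")

    # All pairwise Euclidean distances, shape (n, n).
    diff = pts[:, None, :] - pts[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))

    # Keep ordered pairs with i != j (drop the zero diagonal).
    off_diag = dist[~np.eye(n, dtype=bool)]

    # For each radius, count ordered pairs separated by at most r.
    counts = (off_diag[None, :] <= radii[:, None]).sum(axis=1)

    # K(r) = V / (n (n - 1)) * sum_{i != j} 1[d_ij <= r]
    return volume * counts / (n * (n - 1))


if __name__ == "__main__":
    # Toy coordinates, not real data.
    toy = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 5.0)]
    print(ripley_k_3d(toy, radii=[1.5, 10.0], volume=6.0))
```

For a completely random pattern, K(r) is expected to be close to (4/3)πr³, so values above that reference at a given radius suggest clustering and values below it suggest dispersion. Near the nuclear boundary this uncorrected estimate is biased low, which a real analysis would need to address.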