CNRS, Univ Rennes, IGDR (Institut Genetics and Development of Rennes) - UMR 6290, Rennes, France.
PLoS Comput Biol. 2024 Sep 5;20(9):e1012330. doi: 10.1371/journal.pcbi.1012330. eCollection 2024 Sep.
How can inter-individual variability be quantified? Measuring many features per experiment raises the question of which ones to choose to summarize high-dimensional data. Tackling this challenge on spindle elongation phenotypes, we showed that only three typical elongation patterns describe spindle elongation in the C. elegans one-cell embryo. These archetypes, automatically extracted from the experimental data using principal component analysis (PCA), accounted for more than 95% of the inter-individual variability across more than 1600 experiments and more than 100 different conditions. The first two archetypes were related to the average spindle length and the anaphase elongation rate. The third archetype, accounting for 6% of the variability, was novel and corresponded to a transient spindle shortening in late metaphase, reminiscent of kinetochore function-defect phenotypes. Importantly, these three archetypes were robust to the choice of dataset and were found even when considering only non-treated conditions. Thus, the inter-individual differences between genetically perturbed embryos have the same underlying nature as the natural inter-individual differences between wild-type embryos, independently of temperature. We thus propose that, beyond the apparent complexity of the spindle, only three independent mechanisms account for spindle elongation, weighted differently across conditions. Interestingly, the spindle-length archetypes covered both metaphase and anaphase, suggesting that spindle elongation in late metaphase is sufficient to predict the late-anaphase length. We validated this idea using a machine-learning approach. Finally, a given combination of amounts of these three archetypes could represent a quantitative phenotype. To take advantage of this, we set out to predict interacting genes from a seed based on the PCA coefficients. We exemplified this first on the role of tpxl-1, whose homolog tpx2 is involved in spindle microtubule branching; second, on the mechanism regulating metaphase length; and third, on the central spindle players that set the length at anaphase. We found novel interactors that are absent from public databases but supported by recent experimental publications.
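As an illustration only, the archetype extraction and the seed-based ranking described above can be sketched on synthetic data. This is not the authors' pipeline: the curves are simulated, the function names `extract_archetypes` and `rank_by_similarity`, the condition labels, and the use of Euclidean distance between per-condition mean PCA coefficients are all assumptions made for the example.

```python
import numpy as np


def extract_archetypes(curves, n_components=3):
    """PCA via SVD on elongation curves of shape (n_embryos, n_timepoints)."""
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # Rows of vt are orthonormal principal directions (the "archetypes").
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s**2 / np.sum(s**2)          # fraction of variance per component
    archetypes = vt[:n_components]
    coefficients = centered @ archetypes.T   # per-embryo weights of each archetype
    return mean_curve, archetypes, coefficients, explained[:n_components]


def rank_by_similarity(coefficients, labels, seed_label):
    """Rank conditions by distance of their mean coefficients to the seed's.

    Euclidean distance is an assumption made for this sketch.
    """
    labels = np.asarray(labels)
    centroids = {
        str(c): coefficients[labels == c].mean(axis=0) for c in np.unique(labels)
    }
    seed = centroids[seed_label]
    distances = [
        (name, float(np.linalg.norm(centroid - seed)))
        for name, centroid in centroids.items()
        if name != seed_label
    ]
    return sorted(distances, key=lambda item: item[1])


# --- Synthetic demonstration (all values below are made up) ---
rng = np.random.default_rng(0)
t = np.linspace(-1.0, 1.0, 60)             # hypothetical time axis, 0 = anaphase onset
offset = np.ones_like(t)                   # shifts the average spindle length
ramp = np.clip(t, 0.0, None)               # changes the anaphase elongation rate
dip = -np.exp(-(((t + 0.2) / 0.1) ** 2))   # transient shortening before anaphase

# Hypothetical conditions: mean weight of each pattern.
condition_weights = {
    "control": (0.0, 0.0, 0.0),
    "seed_gene": (1.0, 0.5, 0.8),
    "similar_gene": (0.9, 0.4, 0.7),
    "other_gene": (-1.0, -0.5, 0.0),
}

curves, labels = [], []
for name, weights in condition_weights.items():
    for _ in range(30):                    # 30 simulated embryos per condition
        w = np.asarray(weights) + rng.normal(0.0, 0.3, size=3)
        curve = 10.0 + w[0] * offset + w[1] * ramp + w[2] * dip
        curves.append(curve + rng.normal(0.0, 0.02, size=t.size))
        labels.append(name)
curves = np.asarray(curves)

mean_curve, archetypes, coefficients, explained = extract_archetypes(curves)
ranking = rank_by_similarity(coefficients, labels, "seed_gene")

print("variance explained by 3 components:", explained.sum())
print("conditions ranked by similarity to seed:", ranking)
```

Because the simulated curves are built from three underlying patterns plus small noise, three components capture nearly all of the variance here by construction; on real data, the number of components to retain would have to be justified from the explained-variance profile.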