Valverde Sergi, Vidiella Blai, Martínez-Redondo Gemma I, Duran-Nebreda Salva, Fernández Rosa, Bombarely Aureliano, Rojas Ana M, Bentley R Alexander
Evolution of Networks Lab, Institute of Evolutionary Biology (CSIC-UPF), Passeig Marítim de la Barceloneta 37-49, Barcelona 08003, Spain.
Center for the Dynamics of Social Complexity (DySoC), University of Tennessee, Knoxville, TN 37996, USA.
Mol Biol Evol. 2025 Jun 4;42(6). doi: 10.1093/molbev/msaf148.
The Gene Ontology is a central resource for representing biological knowledge, yet its internal structure is often treated as static, or as a black box, in computational analyses. Here, we examine 15 years of Gene Ontology evolution using network-based methods, revealing that Gene Ontology changes not only through incremental growth but also through punctuated, curator-driven restructuring. In particular, we document a major reorganization of the Cellular Component branch in 2019, in which broad "part" terms were removed and the ontology was modularized into distinct domains for anatomical entities and protein-containing complexes. This semantic modularity aligns Gene Ontology with emerging frameworks such as the Common Anatomy Reference Ontology and Gene Ontology-Causal Activity Modeling, but it also disrupts similarity metrics that rely solely on hierarchical proximity. More broadly, the restructuring of the Cellular Component branch consolidates a shift toward treating Gene Ontology as a multi-layer semantic network, a transformation rooted in a decade-long process of scientific and social consensus across institutions. These findings underscore the need for version-aware, multi-layer models to ensure reproducibility and interpretability, and to better represent biological function across compositional, spatial, and regulatory dimensions as ontologies continue to evolve.