Genetics. 2024 May 7;227(1). doi: 10.1093/genetics/iyae049.
The Alliance of Genome Resources (Alliance) is an extensible coalition of knowledgebases focused on the genetics and genomics of intensively studied model organisms. The Alliance is organized as individual knowledge centers with strong connections to their research communities and a centralized software infrastructure, discussed here. The Alliance currently represents budding yeast, Caenorhabditis elegans, Drosophila, zebrafish, frog, laboratory mouse, and laboratory rat, together with the Gene Ontology Consortium. The project is in a rapid development phase to harmonize knowledge, store it, analyze it, and present it to the community through a web portal, direct downloads, and application programming interfaces (APIs). Here, we focus on developments over the last 2 years. Specifically, we added and enhanced tools for browsing the genome (JBrowse), downloading sequences, mining complex data (AllianceMine), visualizing pathways, full-text searching of the literature (Textpresso), and sequence similarity searching (SequenceServer). We enhanced existing interactive data tables and added an interactive table of paralogs to complement our representation of orthology. To support individual model organism communities, we implemented species-specific "landing pages" and will add disease-specific portals soon; in addition, we support a common community forum implemented in Discourse software. We describe our progress toward a central persistent database to support curation, the data modeling that underpins harmonization, and progress toward a state-of-the-art literature curation system with integrated artificial intelligence and machine learning (AI/ML).