Li Xiaomei, Whan Alex, McNeil Meredith, Starns David, Irons Jessica, Andrew Samuel C, Suchecki Rad
Agriculture and Food, CSIRO, 26 Pembroke Road, Marsfield, NSW 2122, Australia.
Agriculture and Food, CSIRO, 2-40 Clunies Ross Street, Acton, ACT 2601, Australia.
Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf377.
Genome annotation is essential for understanding the functional elements within genomes. While automated methods are indispensable for processing large-scale genomic data, they often struggle to predict gene structures and functions accurately. Consequently, manual curation by domain experts remains crucial for validating and refining these predictions. The complementary roles of automated tools and manual curation highlight the importance of integrating human expertise with artificial intelligence (AI) capabilities to improve both the accuracy and efficiency of genome annotation. However, manual curation is inherently labor-intensive and time-consuming, making it difficult to scale to large datasets. To address these challenges, we propose a conceptual framework, Human-AI Collaborative Genome Annotation (HAICoGA), that leverages a synergistic partnership between humans and AI to enhance human capabilities and accelerate the genome annotation process. Additionally, we explore the potential of integrating large language models into this framework to support and augment specific tasks. Finally, we discuss emerging challenges and outline open research questions to guide further exploration in this area.