Xiao Zhengyang, Pakrasi Himadri B, Chen Yixin, Tang Yinjie J
Department of Energy, Environment, and Chemical Engineering, Washington University in St. Louis, St. Louis, MO, 63130, United States.
Department of Biology, Washington University in St. Louis, St. Louis, MO, 63130, United States.
Metab Eng. 2025 Jan;87:60-67. doi: 10.1016/j.ymben.2024.11.006. Epub 2024 Nov 21.
Large language models (LLMs) can answer general scientific questions, yet they are constrained by their pretraining cut-off dates and lack the ability to provide specific, cited scientific knowledge. Here, we introduce Network for Knowledge Organization (NEKO), a workflow that uses the LLM Qwen to extract knowledge through scientific literature text mining. When a user inputs a keyword of interest, NEKO can generate knowledge graphs that link bioinformation entities and produce comprehensive summaries from PubMed searches. NEKO significantly enhances LLM capability and has immediate applications in daily academic tasks such as the education of young scientists, literature review, paper writing, experiment planning/troubleshooting, and the generation of new ideas/hypotheses. We exemplify this workflow's applicability through several case studies on yeast fermentation and cyanobacterial biorefinery. NEKO's output is more informative, specific, and actionable than GPT-4's zero-shot Q&A. NEKO offers flexible, lightweight local deployment options. NEKO democratizes artificial intelligence (AI) tools, making scientific foundation models more accessible to researchers without excessive computational power.
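To illustrate the kind of pipeline the abstract describes, the following is a minimal sketch (not the authors' implementation): retrieve abstracts for a PubMed keyword search, have an LLM extract entity-relation triples, and link them in a graph. It assumes Biopython and networkx are available; `extract_triples_with_llm` is a hypothetical stand-in for the Qwen-based extraction step.

```python
# Sketch of a NEKO-style workflow: PubMed search -> LLM triple extraction -> knowledge graph.
from Bio import Entrez
import networkx as nx

Entrez.email = "you@example.com"  # required by NCBI E-utilities

def fetch_abstracts(keyword, retmax=20):
    """Return plain-text abstracts for a PubMed keyword search."""
    handle = Entrez.esearch(db="pubmed", term=keyword, retmax=retmax)
    ids = Entrez.read(handle)["IdList"]
    handle = Entrez.efetch(db="pubmed", id=",".join(ids),
                           rettype="abstract", retmode="text")
    return handle.read().split("\n\n")

def extract_triples_with_llm(text):
    """Hypothetical LLM call (e.g., a locally deployed Qwen model) that
    returns a list of (subject, relation, object) triples found in `text`."""
    raise NotImplementedError("plug in your LLM client here")

def build_knowledge_graph(keyword):
    """Link extracted bioinformation entities into a directed graph."""
    graph = nx.DiGraph()
    for abstract in fetch_abstracts(keyword):
        for subj, rel, obj in extract_triples_with_llm(abstract):
            graph.add_edge(subj, obj, relation=rel)
    return graph

# Example usage: graph = build_knowledge_graph("cyanobacterial biorefinery")
```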