
Automating literature screening and curation with applications to computational neuroscience.

Affiliations

Biostatistics, Yale School of Public Health, Yale University, New Haven, CT 06510, United States.

Integrative Genomics, Princeton University, Princeton, NJ 08540, United States.

Publication information

J Am Med Inform Assoc. 2024 Jun 20;31(7):1463-1470. doi: 10.1093/jamia/ocae097.

Abstract

OBJECTIVE

ModelDB (https://modeldb.science) is a discovery platform for computational neuroscience, containing over 1850 published model codes with standardized metadata. These codes were mainly supplied through unsolicited submissions from model authors, but this approach is inherently limited. For example, we estimate we have captured only around one-third of NEURON models, the most common model type in ModelDB. To more completely characterize the state of computational neuroscience modeling work, we aim to identify works containing results derived from computational neuroscience approaches, along with their standardized associated metadata (eg, cell types, research topics).

MATERIALS AND METHODS

Known computational neuroscience work from ModelDB and neuroscience work retrieved from PubMed were included in our study. After pre-screening with SPECTER2 (a free document embedding method), GPT-3.5 and GPT-4 were used to identify likely computational neuroscience work and relevant metadata.
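The pre-screening stage can be pictured as an embedding-similarity filter: candidate papers are compared against known computational neuroscience work, and only the most similar ones are passed to the more expensive GPT stage. The sketch below is illustrative, not the authors' code; `embed` is a hypothetical bag-of-words stand-in for SPECTER2 (which actually produces dense vectors from title plus abstract), and the threshold value is an assumption.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Hypothetical stand-in for SPECTER2: a sparse bag-of-words vector.
    # SPECTER2 itself embeds "title [SEP] abstract" into a dense vector.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def prescreen(candidates, known_positive_texts, threshold=0.2):
    # Build a centroid from known computational neuroscience work
    # (eg, ModelDB entries) and keep only candidates similar enough
    # to it, so the GPT stage sees far fewer papers.
    centroid = Counter()
    for t in known_positive_texts:
        centroid.update(embed(t))
    return [c for c in candidates if cosine(embed(c), centroid) >= threshold]
```

In the real pipeline the embedding model, not keyword overlap, determines similarity, so semantically related papers with little vocabulary overlap can still pass the filter.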

RESULTS

SPECTER2, GPT-4, and GPT-3.5 demonstrated varied but high accuracy in identifying computational neuroscience work. GPT-4 achieved 96.9% accuracy, and GPT-3.5 improved from 54.2% to 85.5% through instruction-tuning and Chain of Thought prompting. GPT-4 also showed high potential for identifying relevant metadata annotations.
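The Chain of Thought setup that lifted GPT-3.5's accuracy can be illustrated with a prompt template that asks the model to reason in steps before committing to a verdict. This is a hedged sketch: the wording, the two reasoning steps, and the `VERDICT:` parsing convention are all assumptions, not the authors' actual prompt.

```python
COT_TEMPLATE = """You are screening papers for a computational neuroscience database.
Title: {title}
Abstract: {abstract}

Think step by step:
1. Does the work build or use a computational/mathematical model of neural systems?
2. Are results derived from simulating or analyzing that model?
Then answer on the last line with exactly "VERDICT: YES" or "VERDICT: NO"."""

def build_prompt(title: str, abstract: str) -> str:
    # Fill the Chain of Thought template for one candidate paper.
    return COT_TEMPLATE.format(title=title, abstract=abstract)

def parse_verdict(model_reply: str) -> bool:
    # Read only the final VERDICT line, ignoring the reasoning steps,
    # so free-form intermediate text cannot corrupt the label.
    for line in reversed(model_reply.strip().splitlines()):
        if line.strip().startswith("VERDICT:"):
            return "YES" in line.upper()
    raise ValueError("no VERDICT line found")
```

Forcing a machine-readable final line is a common way to make Chain of Thought outputs usable in a batch pipeline, since the reasoning text itself varies from call to call.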

DISCUSSION

Accuracy in identification and extraction might be further improved by resolving ambiguity about what counts as a computational element, incorporating more information from papers (eg, the Methods section), refining prompts, etc.

CONCLUSION

Natural language processing and large language model techniques can be added to ModelDB to facilitate further model discovery, and will contribute to a more standardized and comprehensive framework for establishing domain-specific resources.


Similar articles

ModelDB: A Database to Support Computational Neuroscience.
J Comput Neurosci. 2004 Jul-Aug;17(1):7-11. doi: 10.1023/B:JCNS.0000023869.22017.2e.

Automated Metadata Suggestion During Repository Submission.
Neuroinformatics. 2019 Jul;17(3):361-371. doi: 10.1007/s12021-018-9403-z.

Development of an information retrieval tool for biomedical patents.
Comput Methods Programs Biomed. 2018 Jun;159:125-134. doi: 10.1016/j.cmpb.2018.03.012. Epub 2018 Mar 14.

