LION LBD: a literature-based discovery system for cancer biology.

Affiliations

Language Technology Lab, Department of Theoretical and Applied Linguistics, University of Cambridge, Cambridge, UK.

Institute of Environmental Medicine, Karolinska Institutet, Stockholm, Sweden.

Publication information

Bioinformatics. 2019 May 1;35(9):1553-1561. doi: 10.1093/bioinformatics/bty845.

DOI: 10.1093/bioinformatics/bty845

PMID: 30304355

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6499247/

Abstract

MOTIVATION

The overwhelming size and rapid growth of the biomedical literature make it impossible for scientists to read all studies related to their work, potentially leading to missed connections and wasted time and resources. Literature-based discovery (LBD) aims to alleviate these issues by identifying implicit links between disjoint parts of the literature. While LBD has been studied in depth since its introduction three decades ago, there has been limited work making use of recent advances in biomedical text processing methods in LBD.

RESULTS

We present LION LBD, a literature-based discovery system that enables researchers to navigate published information and supports hypothesis generation and testing. The system is built with a particular focus on the molecular biology of cancer using state-of-the-art machine learning and natural language processing methods, including named entity recognition and grounding to domain ontologies covering a wide range of entity types and a novel approach to detecting references to the hallmarks of cancer in text. LION LBD implements a broad selection of co-occurrence-based metrics for analyzing the strength of entity associations, and its design allows real-time search to discover indirect associations between entities in a database of tens of millions of publications while preserving the ability of users to explore each mention in its original context in the literature. Evaluations of the system demonstrate its ability to identify undiscovered links and rank relevant concepts highly among potential connections.
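To make the idea of co-occurrence metrics and indirect associations concrete, the following is a minimal sketch and not LION LBD's actual implementation. The abstract does not name the specific metrics, so Jaccard and pointwise mutual information (PMI) are used here as assumed examples of co-occurrence measures. The toy corpus, the function names (`count_cooccurrences`, `open_discovery`) and the weakest-link aggregation over intermediate entities are all hypothetical choices made for illustration.

```python
import math
from collections import defaultdict
from itertools import combinations

# Toy corpus (illustrative data, not taken from LION LBD): each document
# is reduced to the set of entities it mentions.
DOCS = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "raynaud disease"},
    {"platelet aggregation", "raynaud disease"},
    {"fish oil", "blood viscosity"},
    {"aspirin", "platelet aggregation"},
]


def count_cooccurrences(docs):
    """Count document frequencies for single entities and entity pairs."""
    entity_counts = defaultdict(int)
    pair_counts = defaultdict(int)
    for entities in docs:
        for entity in entities:
            entity_counts[entity] += 1
        # Sorting makes each unordered pair map to one canonical key.
        for a, b in combinations(sorted(entities), 2):
            pair_counts[(a, b)] += 1
    return entity_counts, pair_counts


def jaccard(n_ab, n_a, n_b, n_docs):
    """Jaccard index: shared documents over documents mentioning either."""
    union = n_a + n_b - n_ab
    return n_ab / union if union else 0.0


def pmi(n_ab, n_a, n_b, n_docs):
    """Pointwise mutual information (base 2) from document frequencies."""
    if n_ab == 0:
        return float("-inf")
    return math.log2(n_ab * n_docs / (n_a * n_b))


def open_discovery(start, docs, metric=jaccard):
    """Rank entities C linked to `start` only through intermediates B.

    Each A-B-C path is scored by its weaker link and paths to the same C
    are summed. This aggregation is an assumption of the sketch.
    """
    entity_counts, pair_counts = count_cooccurrences(docs)
    n_docs = len(docs)

    neighbours = defaultdict(set)
    for a, b in pair_counts:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def score(x, y):
        n_xy = pair_counts.get(tuple(sorted((x, y))), 0)
        return metric(n_xy, entity_counts[x], entity_counts[y], n_docs)

    totals = defaultdict(float)
    direct = neighbours[start]
    for b in direct:
        for c in neighbours[b]:
            if c == start or c in direct:
                continue  # keep only entities never co-mentioned with start
            totals[c] += min(score(start, b), score(b, c))
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for entity, value in open_discovery("fish oil", DOCS):
        print(f"{entity}: {value:.2f}")
```

On this toy corpus, starting from "fish oil" ranks "raynaud disease" above "aspirin", because two intermediate entities connect it to the start term. Passing `metric=pmi` swaps in the other measure without changing the traversal.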

AVAILABILITY AND IMPLEMENTATION

The LION LBD system is available via a web-based user interface and a programmable API, and all components of the system are made available under open licenses from the project home page http://lbd.lionproject.net.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
