Large-scale literature mining to assess the relation between anti-cancer drugs and cancer types.

Affiliations

MicroDiscovery GmbH, Marienburger Straße 1, 10405, Berlin, Germany.

Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Ihnestraße 63, 14195, Berlin, Germany.

Publication information

J Transl Med. 2021 Jun 26;19(1):274. doi: 10.1186/s12967-021-02941-z.

DOI:10.1186/s12967-021-02941-z

PMID:34174885

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8236166/

Abstract

BACKGROUND

A large body of scientific literature describes the relation between tumor types and anti-cancer drugs. Its sheer volume makes it impossible for researchers and physicians to extract all relevant information manually.

METHODS

To cope with the large amount of literature, we applied an automated text mining approach to assess the relations between the 30 most frequent cancer types and 270 anti-cancer drugs. We used two different approaches: classical text mining based on named entity recognition, and an AI-based approach employing word embeddings. The consistency of the literature mining results was validated with three independent methods: first, data from FDA approvals; second, experimentally measured IC50 cell line data; and third, clinical patient survival data.
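The two approaches named above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not specify the implementation, so plain dictionary matching stands in for named entity recognition, and cosine similarity is shown only as one common way to compare word-embedding vectors. The vocabularies, example sentences, and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical mini-vocabularies (assumption); a real system would use
# curated drug and cancer-type dictionaries or a trained NER model.
DRUGS = {"imatinib", "cisplatin", "tamoxifen"}
CANCERS = {"leukemia", "breast cancer", "lung cancer"}


def find_terms(text, vocabulary):
    """Return vocabulary terms found in text as whole words, ignoring case."""
    lowered = text.lower()
    return {
        term
        for term in vocabulary
        if re.search(r"\b" + re.escape(term) + r"\b", lowered)
    }


def cooccurrence_counts(abstracts):
    """Count, per (drug, cancer) pair, the abstracts mentioning both terms."""
    counts = Counter()
    for text in abstracts:
        drugs = find_terms(text, DRUGS)
        cancers = find_terms(text, CANCERS)
        for drug in drugs:
            for cancer in cancers:
                counts[(drug, cancer)] += 1
    return counts


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Invented example sentences, used only to exercise the functions.
abstracts = [
    "Imatinib induces remission in chronic myeloid leukemia.",
    "Tamoxifen reduces recurrence of breast cancer.",
    "Cisplatin is widely used in lung cancer; imatinib was not effective.",
    "Long-term imatinib therapy in leukemia patients.",
]
counts = cooccurrence_counts(abstracts)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Plain co-occurrence also counts negated mentions, such as the third example sentence, which is one reason validating the results against independent data is useful.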

RESULTS

We demonstrated that automated text mining can successfully assess the relation between cancer types and anti-cancer drugs. All validation methods showed good correspondence between the literature mining results and the independent confirmatory approaches. The relations between the most frequent cancer types and the drugs employed for their treatment were visualized in a large heatmap. All results are accessible in an interactive web-based knowledge base at https://knowledgebase.microdiscovery.de/heatmap.

CONCLUSIONS

Our approach can assess the relations between compounds and cancer types in an automated manner. Both cancer types and compounds could be grouped into distinct clusters. Researchers can use the interactive knowledge base to inspect the presented results and pursue their own research questions, for example identifying novel indication areas for known drugs.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2722/8236166/f764b431b57b/12967_2021_2941_Fig1_HTML.jpg

Similar articles

Text mining in livestock animal science: introducing the potential of text mining to animal sciences.

J Anim Sci. 2012 Oct;90(10):3666-76. doi: 10.2527/jas.2011-4841. Epub 2012 Jun 4.

PKDE4J: Entity and relation extraction for public knowledge discovery.

J Biomed Inform. 2015 Oct;57:320-32. doi: 10.1016/j.jbi.2015.08.008. Epub 2015 Aug 12.

Extraction of relations between genes and diseases from text and large-scale data analysis: implications for translational research.

BMC Bioinformatics. 2015 Feb 21;16:55. doi: 10.1186/s12859-015-0472-9.

Knowledge based word-concept model estimation and refinement for biomedical text mining.

J Biomed Inform. 2015 Feb;53:300-7. doi: 10.1016/j.jbi.2014.11.015. Epub 2014 Dec 12.

miRiaD: A Text Mining Tool for Detecting Associations of microRNAs with Diseases.

J Biomed Semantics. 2016 Apr 29;7(1):9. doi: 10.1186/s13326-015-0044-y.

The Voice of Chinese Health Consumers: A Text Mining Approach to Web-Based Physician Reviews.

J Med Internet Res. 2016 May 10;18(5):e108. doi: 10.2196/jmir.4430.

Relation mining experiments in the pharmacogenomics domain.

J Biomed Inform. 2012 Oct;45(5):851-61. doi: 10.1016/j.jbi.2012.04.014. Epub 2012 May 10.

iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature.

Database (Oxford). 2018 Jan 1;2018:bay128. doi: 10.1093/database/bay128.

Bio-semantic relation extraction with attention-based external knowledge reinforcement.

BMC Bioinformatics. 2020 May 24;21(1):213. doi: 10.1186/s12859-020-3540-8.

Cited by

Systematic analysis of hepatotoxicity: combining literature mining and AI language models.

Front Artif Intell. 2025 Jul 21;8:1561292. doi: 10.3389/frai.2025.1561292. eCollection 2025.

Cucurbitacins as potential anticancer agents: new insights on molecular mechanisms.

J Transl Med. 2022 Dec 31;20(1):630. doi: 10.1186/s12967-022-03828-3.

Global Mapping of Interventions to Improve Quality of Life of Patients with Cancer: A Protocol for Literature Mining and Meta-Analysis.

Int J Environ Res Public Health. 2022 Dec 2;19(23):16155. doi: 10.3390/ijerph192316155.

References

Drug-Drug Interactions of Irinotecan, 5-Fluorouracil, Folinic Acid and Oxaliplatin and Its Activity in Colorectal Carcinoma Treatment.

Molecules. 2020 Jun 4;25(11):2614. doi: 10.3390/molecules25112614.

Unsupervised word embeddings capture latent knowledge from materials science literature.

Nature. 2019 Jul;571(7763):95-98. doi: 10.1038/s41586-019-1335-8. Epub 2019 Jul 3.

BioWordVec, improving biomedical word embeddings with subword information and MeSH.

Sci Data. 2019 May 10;6(1):52. doi: 10.1038/s41597-019-0055-0.

BioReader: a text mining tool for performing classification of biomedical literature.

BMC Bioinformatics. 2019 Feb 4;19(Suppl 13):57. doi: 10.1186/s12859-019-2607-x.

Comparison of Sales Income and Research and Development Costs for FDA-Approved Cancer Drugs Sold by Originator Drug Companies.

JAMA Netw Open. 2019 Jan 4;2(1):e186875. doi: 10.1001/jamanetworkopen.2018.6875.

Consolidation therapy with the combination of bortezomib and lenalidomide (VR) without dexamethasone in multiple myeloma patients after transplant: Effects on survival and bone outcomes in the absence of bisphosphonates.

Am J Hematol. 2019 Apr;94(4):400-407. doi: 10.1002/ajh.25392. Epub 2019 Jan 10.

iTextMine: integrated text-mining system for large-scale knowledge extraction from the literature.

Database (Oxford). 2018 Jan 1;2018:bay128. doi: 10.1093/database/bay128.

PPICurator: A Tool for Extracting Comprehensive Protein-Protein Interaction Information.

Proteomics. 2019 Feb;19(4):e1800291. doi: 10.1002/pmic.201800291. Epub 2019 Jan 7.

PubChem 2019 update: improved access to chemical data.

Nucleic Acids Res. 2019 Jan 8;47(D1):D1102-D1109. doi: 10.1093/nar/gky1033.

A comparison of word embeddings for the biomedical natural language processing.

J Biomed Inform. 2018 Nov;87:12-20. doi: 10.1016/j.jbi.2018.09.008. Epub 2018 Sep 12.
