Machine learning to promote translational research: predicting patent and clinical trial inclusion in dementia research.

Author information

Beinat Matilda, Beinat Julian, Shoaib Mohammed, Magenti Jorge Gomez

Affiliations

Institute of Psychiatry Psychology and Neuroscience, King's College London, SE5 8AB, London, UK.

Independent Researcher, 1071XA, Amsterdam, The Netherlands.

Publication information

Brain Commun. 2024 Jul 25;6(4):fcae230. doi: 10.1093/braincomms/fcae230. eCollection 2024.

Abstract

Projected to impact 1.6 million people in the UK by 2040 and costing £25 billion annually, dementia presents a growing challenge to society. This study, a pioneering effort to predict the translational potential of dementia research using machine learning, hopes to address the slow translation of fundamental discoveries into practical applications despite dementia's significant societal and economic impact. We used the Dimensions database to extract data from 43 091 UK dementia research publications between the years 1990 and 2023, specifically metadata (authors, publication year, etc.), concepts mentioned in the paper and the paper abstract. To prepare the data for machine learning, we applied methods such as one-hot encoding and word embeddings. We trained a CatBoost Classifier to predict whether a publication will be cited in a future patent or clinical trial. We trained several model variations. The model combining metadata, concept and abstract embeddings yielded the highest performance: for patent predictions, an area under the receiver operating characteristic curve of 0.84 and 77.17% accuracy; for clinical trial predictions, an area under the receiver operating characteristic curve of 0.81 and 75.11% accuracy. The results demonstrate that integrating machine learning within current research methodologies can uncover overlooked publications, expediting the identification of promising research and potentially transforming dementia research by predicting real-world impact and guiding translational strategies.
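As a rough illustration of the pipeline the abstract outlines, the sketch below shows how one-hot encoded metadata might be concatenated with an abstract embedding into a single feature vector, and how the two reported metrics (area under the receiver operating characteristic curve, and accuracy) can be computed from predicted scores. It is not the authors' code: the `journal` field, the two-dimensional embedding, the labels and the scores are hypothetical placeholders, and no CatBoost model is trained here.

```python
from typing import Dict, List, Sequence


def one_hot(value: str, vocabulary: Sequence[str]) -> List[float]:
    """Encode a categorical value as a 0/1 vector over a fixed vocabulary."""
    return [1.0 if value == item else 0.0 for item in vocabulary]


def build_features(record: Dict, journals: Sequence[str]) -> List[float]:
    """Concatenate one-hot metadata, publication year and an abstract embedding."""
    return (
        one_hot(record["journal"], journals)
        + [float(record["year"])]
        + list(record["abstract_embedding"])
    )


def auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def accuracy(labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5) -> float:
    """Fraction of records whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


if __name__ == "__main__":
    # Hypothetical record; the field names are assumptions, not the Dimensions schema.
    journals = ["Brain", "Brain Commun"]
    record = {"journal": "Brain Commun", "year": 2024, "abstract_embedding": [0.1, -0.2]}
    print(build_features(record, journals))

    # Hypothetical labels (1 = later cited in a patent) and classifier scores.
    labels = [1, 0, 1, 0, 0]
    scores = [0.9, 0.4, 0.35, 0.2, 0.6]
    print(auroc(labels, scores), accuracy(labels, scores))
```

In practice the feature vectors would be passed to a gradient-boosted classifier such as the CatBoost model named in the abstract, and the scores would be that model's predicted probabilities on held-out publications.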

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9951/11269431/d9e908ef263f/fcae230_ga.jpg
