A web-based automated machine learning platform to analyze liquid biopsy data.

Authors

Shen Hanfei, Liu Tony, Cui Jesse, Borole Piyush, Benjamin Ari, Kording Konrad, Issadore David

Affiliation

Department of Bioengineering, University of Pennsylvania, Philadelphia, PA 19104, USA.

Publication details

Lab Chip. 2020 Jun 21;20(12):2166-2174. doi: 10.1039/d0lc00096e. Epub 2020 May 18.

Abstract

Liquid biopsy (LB) technologies continue to improve in sensitivity, specificity, and multiplexing, and can measure an ever-growing library of disease biomarkers. However, clinical interpretation of the increasingly large datasets these technologies generate remains a challenge. Machine learning is a popular approach for discovering and detecting signatures of disease, but limited machine learning expertise in the LB field has kept the discipline from fully leveraging these tools and risks improper analyses and irreproducible results. In this paper, we develop a web-based automated machine learning tool tailored specifically for LB, in which machine learning models can be built without the user's input. We also incorporate a differential privacy algorithm designed to limit the overfitting that can arise when users iteratively develop a panel using feedback from our platform. We validate our approach by performing a meta-analysis on 11 published LB datasets and find that its performance is similar to or better than that reported in the literature. Moreover, we show that our platform's performance improves when it incorporates information from prior LB datasets, suggesting that this approach can continue to improve with increased access to LB data. Finally, we show that, using our platform, the results achieved in the literature can be matched with 40% of the number of subjects in the training set, potentially reducing study cost and time. This self-improving and overfitting-resistant automated machine learning platform provides a new standard that can be used to validate machine learning work in the LB field.
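
The abstract does not say which differential privacy algorithm the platform uses. As an illustration only, the sketch below shows one well-known mechanism of this kind, a simplified Thresholdout-style reusable holdout: the held-out score is revealed (with Laplace noise) only when it disagrees with the training score by more than a noisy threshold, and each such reveal consumes a fixed budget. The class name `ReusableHoldout`, its parameters, and their default values are hypothetical and are not taken from the paper.

```python
import random


def laplace(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two Exp(1) draws."""
    return scale * (rng.expovariate(1.0) - rng.expovariate(1.0))


class ReusableHoldout:
    """Simplified Thresholdout-style gatekeeper for a held-out test set.

    Assumption: this is a generic illustration, not the platform's code.
    threshold, sigma and budget are hypothetical tuning parameters.
    """

    def __init__(self, threshold=0.04, sigma=0.01, budget=100, seed=0):
        self.threshold = threshold
        self.sigma = sigma
        self.budget = budget
        self.rng = random.Random(seed)
        self.noisy_threshold = threshold + laplace(2 * sigma, self.rng)

    def query(self, train_score, holdout_score):
        """Return the score the user is allowed to see for one submitted model."""
        if self.budget <= 0:
            raise RuntimeError("holdout budget exhausted")
        gap = abs(holdout_score - train_score)
        if gap > self.noisy_threshold + laplace(4 * self.sigma, self.rng):
            # The model looks overfit: reveal a noised holdout score,
            # spend one unit of budget, and resample the threshold noise.
            self.budget -= 1
            self.noisy_threshold = self.threshold + laplace(2 * self.sigma, self.rng)
            return holdout_score + laplace(self.sigma, self.rng)
        # Train and holdout agree: answer with the training score, so no
        # new information about the holdout set leaks to the user.
        return train_score


if __name__ == "__main__":
    gate = ReusableHoldout()
    # Hypothetical scores from two rounds of panel refinement.
    print(gate.query(train_score=0.90, holdout_score=0.89))
    print(gate.query(train_score=0.95, holdout_score=0.80))
```

The design point is that a user who repeatedly refines a panel against the same feedback learns little about the held-out subjects unless a model is clearly overfit, which is what limits adaptive overfitting.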

