Super.FELT: supervised feature extraction learning using triplet loss for drug response prediction with multi-omics data.

Author affiliations

School of Electrical Engineering and Computer Science, Gwangju Institute of Science and Technology, Gwangju, South Korea.

Graduate School of Artificial Intelligence, Gwangju Institute of Science and Technology, Gwangju, South Korea.

Publication information

BMC Bioinformatics. 2021 May 25;22(1):269. doi: 10.1186/s12859-021-04146-z.

Abstract

BACKGROUND

Predicting a patient's drug response is important for precision oncology. Recent studies have used multi-omics data to improve the accuracy of drug response prediction. Although multi-omics data are a rich resource for this task, their high dimensionality tends to hinder performance gains. In this study, we aimed to develop a new method, based on a supervised deep learning model, that effectively reduces this dimensionality for drug response prediction.

RESULTS

We proposed a novel method called Supervised Feature Extraction Learning using Triplet loss (Super.FELT) for drug response prediction. Super.FELT consists of three stages: feature selection, supervised feature encoding, and binary classification of drug response (sensitive or resistant). We used multi-omics data comprising mutation, copy number aberration, and gene expression profiles, obtained from cell lines [Genomics of Drug Sensitivity in Cancer (GDSC), Cancer Cell Line Encyclopedia (CCLE), and Cancer Therapeutics Response Portal (CTRP)], patient-derived tumor xenografts (PDX), and The Cancer Genome Atlas (TCGA). GDSC was used for training and cross-validation, and CCLE, CTRP, PDX, and TCGA were used for external validation. We performed ablation studies for the three stages and verified that using multi-omics data improves drug response prediction performance. Super.FELT outperformed the other methods in external validation on PDX and TCGA, and performed well in cross-validation on GDSC and in external validation on CCLE and CTRP. In addition, our experiments confirmed that multi-omics data are useful for external non-cell line data.
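The first two stages described above can be illustrated with a minimal sketch. The abstract does not state the selection criterion, the distance function, or the margin, so the variance threshold, the Euclidean distance, and the margin of 1.0 used here are assumptions for illustration only, and the function names are hypothetical. The sketch also assumes that a triplet pairs an anchor with a sample of the same response label (positive) and a sample of the opposite label (negative). It is not the authors' implementation, which is available in their repository.

```python
import math
from statistics import pvariance


def select_by_variance(samples, threshold):
    """Return indices of features whose population variance exceeds threshold.

    Assumption: variance filtering stands in for the feature selection stage.
    """
    columns = list(zip(*samples))  # transpose: one tuple per feature
    return [j for j, col in enumerate(columns) if pvariance(col) > threshold]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard hinge-style triplet loss: max(d(a, p) - d(a, n) + margin, 0)."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(gap + margin, 0.0)


# Toy omics matrix (hypothetical values): rows are samples, columns are features.
toy_samples = [
    [1.0, 0.0, 5.0],
    [1.0, 1.0, 5.0],
    [1.0, 0.0, 9.0],
]
kept = select_by_variance(toy_samples, threshold=0.1)  # drops the constant feature

# Toy encoded vectors (hypothetical values): two sensitive samples, one resistant.
sensitive_a = [0.0, 0.0]
sensitive_b = [0.0, 1.0]
resistant = [3.0, 4.0]

# Same-label samples are close and the other label is far, so the loss is zero.
well_separated = triplet_loss(sensitive_a, sensitive_b, resistant)
# Swapping the roles simulates a poor encoding and yields a positive loss.
poorly_separated = triplet_loss(sensitive_a, resistant, sensitive_b)
```

Minimizing a loss of this form over many triplets pushes the encoder to place samples with the same response label close together and samples with different labels at least a margin apart, which is what makes the encoding stage supervised.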

CONCLUSION

By separating the three stages, Super.FELT achieved better performance than the other methods. Our results show that training the encoders and the classifier independently is important, especially for external testing on PDX and TCGA. Moreover, although gene expression is the most informative data type for cell line data, multi-omics data offer better performance than gene expression alone in external validation on non-cell line data. The source code of Super.FELT is available at https://github.com/DMCB-GIST/Super.FELT.
