CrossLink: a novel method for cross-condition classification of cancer subtypes.

Author information

Ma Chifeng, Sastry Konduru S, Flore Mario, Gehani Salah, Al-Bozom Issam, Feng Yusheng, Serpedin Erchin, Chouchane Lotfi, Chen Yidong, Huang Yufei

Affiliations

Department of Electrical and Computer Engineering, University of Texas at San Antonio, San Antonio, TX, USA.

Weill Cornell Medicine-Qatar, Doha, Qatar.

Publication information

BMC Genomics. 2016 Aug 22;17(Suppl 7):549. doi: 10.1186/s12864-016-2903-z.

Abstract

BACKGROUND

We considered the prediction of cancer classes (e.g., subtypes) from patient gene expression profiles that contain both systematic and condition-specific biases relative to the training reference dataset. Conventional normalization-based approaches cannot guarantee that the gene signatures in the reference and prediction datasets have the same distribution under all conditions, because the class-specific gene signatures change with the condition. As a result, a trained classifier may work well under one condition but not under another.

METHODS

To address this limitation of current normalization approaches, we propose a novel algorithm called CrossLink (CL). CL recognizes that there is no universal, condition-independent normalization mapping of signatures. Instead, it exploits the fact that, under any condition, each signature is unique to its associated class, and it therefore employs an unsupervised clustering algorithm to discover this unique signature.
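The abstract does not describe the clustering or matching steps in detail, so the following is only a minimal Python sketch of the general idea, not the published CrossLink algorithm. It assumes that every class is present in the new dataset, that the condition-specific bias is a gene-wise offset plus a global scaling, and that clusters can be linked to reference classes by correlating centered centroids. All function names, the toy signatures, and the class labels are hypothetical.

```python
import itertools
import numpy as np

def farthest_point_init(X, k):
    """Deterministic seeding: repeatedly pick the point farthest from the chosen seeds."""
    idx = [0]
    for _ in range(k - 1):
        d = np.min(((X[:, None, :] - X[idx][None, :, :]) ** 2).sum(axis=2), axis=1)
        idx.append(int(d.argmax()))
    return X[idx].copy()

def kmeans(X, k, n_iter=20):
    """Plain Lloyd's k-means; returns labels from the last assignment step and centroids."""
    centroids = farthest_point_init(X, k)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

def cross_condition_predict(X_ref, y_ref, X_new):
    """Cluster the new data without labels, then link each cluster to a reference class."""
    X_ref = np.asarray(X_ref, dtype=float)
    X_new = np.asarray(X_new, dtype=float)
    y_ref = np.asarray(y_ref)
    classes = np.unique(y_ref)
    k = len(classes)
    ref_centroids = np.array([X_ref[y_ref == c].mean(axis=0) for c in classes])
    labels, new_centroids = kmeans(X_new, k)
    # Center each centroid set gene-wise so a dataset-wide offset cancels out (assumption).
    ref_c = ref_centroids - ref_centroids.mean(axis=0)
    new_c = new_centroids - new_centroids.mean(axis=0)
    sim = np.corrcoef(ref_c, new_c)[:k, k:]  # rows: reference classes, cols: clusters
    # Brute-force one-to-one matching; adequate for a handful of classes.
    best = max(itertools.permutations(range(k)),
               key=lambda p: sum(sim[i, p[i]] for i in range(k)))
    cluster_to_class = np.empty(k, dtype=classes.dtype)
    for class_idx, cluster_idx in enumerate(best):
        cluster_to_class[cluster_idx] = classes[class_idx]
    return cluster_to_class[labels]

# Toy data: three classes, six genes; the new condition halves the signal and shifts each gene.
rng = np.random.default_rng(0)
signatures = np.array([[4, 4, 0, 0, 0, 0],
                       [0, 0, 4, 4, 0, 0],
                       [0, 0, 0, 0, 4, 4]], dtype=float)
names = np.array(["Basal", "Her2", "LumA"])
y_ref = np.repeat(names, 10)
X_ref = np.repeat(signatures, 10, axis=0) + rng.normal(0, 0.05, (30, 6))
true_idx = np.array([2, 0, 1, 1, 2, 0, 0, 2, 1])
offset = np.array([3, 1, 2, 5, 0, 1], dtype=float)
X_new = 0.5 * signatures[true_idx] + offset + rng.normal(0, 0.05, (9, 6))
y_pred = cross_condition_predict(X_ref, y_ref, X_new)
```

In this toy setting the clusters are recovered from the new data alone, so the shifted signatures never need to be mapped back onto the reference scale sample by sample; only the cluster centroids are compared with the reference classes.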

RESULTS

We assessed the performance of CL for cross-condition prediction of PAM50 subtypes of breast cancer using a simulated dataset modeled after TCGA BRCA tumor samples with a cross-validation scheme, as well as datasets with known and unknown PAM50 classification. CL achieved a prediction accuracy of >73%, the highest among the methods we evaluated. We also applied the algorithm to a set of breast cancer tumors from an Arab population to assign a PAM50 classification to each tumor based on its gene expression profile.

CONCLUSIONS

We proposed CrossLink, a novel algorithm for cross-condition prediction of cancer classes. In all test datasets, CL showed robust and consistent improvement in prediction performance over other state-of-the-art normalization and classification algorithms.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ba61/5001207/e8b529e9f001/12864_2016_2903_Fig1_HTML.jpg
