Zou Jinfeng, Wang Edwin
National Research Council Canada, Montreal, QC H4P 2R2, Canada.
National Research Council Canada, Montreal, QC H4P 2R2, Canada; Department of Experimental Medicine, McGill University, Montreal, QC H3A 2B2, Canada; Center for Bioinformatics, McGill University, Montreal, QC H3G 0B1, Canada; Center for Health Genomics and Informatics, University of Calgary Cumming School of Medicine, Calgary, AB T2N 4N1, Canada; Department of Biochemistry & Molecular Biology, University of Calgary Cumming School of Medicine, Calgary, AB T2N 4N1, Canada; Department of Medical Genetics, University of Calgary Cumming School of Medicine, Calgary, AB T2N 4N1, Canada; Department of Oncology, University of Calgary Cumming School of Medicine, Calgary, AB T2N 4N1, Canada; Alberta Children's Hospital Research Institute, Calgary, AB T2N 4N1, Canada; Arnie Charbonneau Cancer Research Institute, Calgary, AB T2N 4N1, Canada; O'Brien Institute for Public Health, Calgary, AB T2N 4N1, Canada.
Genomics Proteomics Bioinformatics. 2017 Apr;15(2):130-140. doi: 10.1016/j.gpb.2017.01.004. Epub 2017 Apr 4.
With advances in technology for detecting circulating tumor cells (CTCs) and cell-free DNA (cfDNA) in blood, serum, and plasma, non-invasive diagnosis of cancer has become promising. A few studies have reported good correlations between signals from tumor tissues and those from CTCs or cfDNA, making it possible to detect cancers using CTCs and cfDNA. However, such detection does not reveal which cancer type a person has. To address this challenge, we developed an algorithm, eTumorType, to identify cancer types based on copy number variations (CNVs) of the cancer founding clone. eTumorType integrates cancer hallmark concepts with several computational techniques, including stochastic gradient boosting, voting, centroid, and leading patterns. eTumorType was trained and validated on a large dataset comprising 18 common cancer types and 5327 tumor samples. It achieved high accuracies (0.86-0.96) and high recall rates (0.79-0.92) for predicting colon, brain, prostate, and kidney cancers. In addition, relatively high accuracies (0.78-0.92) and recall rates (0.58-0.95) were achieved for predicting ovarian, luminal breast, lung, endometrial, stomach, head and neck, leukemia, and skin cancers. These results suggest that eTumorType could be used for non-invasive diagnosis to determine cancer types based on CNVs of CTCs and cfDNA.
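The accuracy and recall figures above are reported per cancer type. As a minimal sketch of how such per-type figures can be computed, the snippet below treats each cancer type as a one-vs-rest problem. This is an assumption for illustration: the abstract does not state how eTumorType's per-type accuracy was defined, and the function name `per_class_metrics` and the toy labels are hypothetical, not data from the study.

```python
def per_class_metrics(y_true, y_pred, label):
    """Return (accuracy, recall) for one class, treated one-vs-rest.

    Assumption: accuracy = (TP + TN) / N and recall = TP / (TP + FN).
    The abstract does not confirm that these are the definitions used.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    tn = len(pairs) - tp - fn - fp

    accuracy = (tp + tn) / len(pairs)
    # Recall is undefined when the class never occurs in y_true.
    recall = tp / (tp + fn) if (tp + fn) else float("nan")
    return accuracy, recall


# Toy labels, invented for illustration only (not results from the paper).
y_true = ["colon", "colon", "brain", "kidney", "brain", "colon"]
y_pred = ["colon", "brain", "brain", "kidney", "brain", "colon"]

for cancer_type in ("colon", "brain", "kidney"):
    acc, rec = per_class_metrics(y_true, y_pred, cancer_type)
    print(f"{cancer_type}: accuracy={acc:.2f}, recall={rec:.2f}")
```

Under this one-vs-rest reading, a cancer type can show high accuracy alongside lower recall when it is rare in the dataset, because true negatives dominate the accuracy term. That is one possible reason the reported recall range (0.58-0.95) is wider than the accuracy range.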