Identifying candidate drivers of drug response in heterogeneous cancer by mining high throughput genomics data.

Author information

Nabavi Sheida

Affiliation

Computer Science and Engineering Department, Institute for Systems Genomics, University of Connecticut, 371 Fairfield Way, Unit 4155, Storrs, CT, 06268, USA.

Publication information

BMC Genomics. 2016 Aug 15;17(1):638. doi: 10.1186/s12864-016-2942-5.

Abstract

BACKGROUND

Advances in technology have made large volumes of many types of high-throughput genomics data available. These data have tremendous potential to identify new and clinically valuable biomarkers to guide the diagnosis, assessment of prognosis, and treatment of complex diseases, such as cancer. Integrating, analyzing, and interpreting large, noisy genomics data to obtain biologically meaningful results, however, remains highly challenging. Mining genomics datasets with advanced computational methods can help to address these issues.

RESULTS

To facilitate the identification of a short list of biologically meaningful genes as candidate drivers of anti-cancer drug resistance from an enormous amount of heterogeneous data, we employed statistical machine-learning techniques and integrated genomics datasets. We developed a computational method that integrates gene expression, somatic mutation, and copy number aberration data of sensitive and resistant tumors. The method first applies an integrative approach based on module network analysis to identify potential driver genes. This is followed by cross-validation and a comparison of the results of the sensitive and resistant groups to obtain the final list of candidate biomarkers. We applied this method to the ovarian cancer data from The Cancer Genome Atlas (TCGA). The final result contains biologically relevant genes, such as COL11A1, which has been reported as a cisplatin resistance biomarker for epithelial ovarian carcinoma in several recent studies.
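The last two steps, cross-validation followed by a comparison of the sensitive and resistant groups, can be illustrated with a minimal sketch. The abstract does not specify how the comparison is done, so everything here is an assumption made for illustration: the function names, the `min_folds` stability threshold, the rule of keeping genes that are stable in the resistant group but not in the sensitive group, and the placeholder gene names in the toy data. The module network step itself is not reproduced; the sketch starts from per-fold lists of selected driver genes.

```python
from collections import Counter


def stable_genes(fold_selections, min_folds):
    """Return genes selected as drivers in at least `min_folds` folds.

    `fold_selections` is a list with one entry per cross-validation fold;
    each entry is the list of genes picked as drivers in that fold.
    """
    # Count each gene at most once per fold.
    counts = Counter(gene for fold in fold_selections for gene in set(fold))
    return {gene for gene, n in counts.items() if n >= min_folds}


def group_specific_candidates(resistant_folds, sensitive_folds, min_folds):
    """Hypothetical comparison rule: keep genes that are stable drivers
    in the resistant group but not in the sensitive group."""
    resistant = stable_genes(resistant_folds, min_folds)
    sensitive = stable_genes(sensitive_folds, min_folds)
    return sorted(resistant - sensitive)


# Toy data with placeholder gene names (not results from the paper).
resistant_folds = [
    ["GENE_X", "GENE_A", "GENE_B"],
    ["GENE_X", "GENE_A"],
    ["GENE_X", "GENE_A", "GENE_C"],
    ["GENE_X", "GENE_B"],
    ["GENE_X", "GENE_A"],
]
sensitive_folds = [
    ["GENE_A", "GENE_D"],
    ["GENE_A"],
    ["GENE_A", "GENE_D"],
    ["GENE_A", "GENE_C"],
    ["GENE_A"],
]

candidates = group_specific_candidates(resistant_folds, sensitive_folds, min_folds=4)
print(candidates)  # ['GENE_X']
```

In this toy example, GENE_A is a stable driver in both groups and is therefore dropped, while GENE_X is stable only in the resistant group and is kept. A set difference is only one plausible way to compare the two groups; the published method may use a different criterion.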

CONCLUSIONS

The described method yields a short list of aberrant genes that also control the expression of their co-regulated genes. The results suggest that this unbiased, data-driven computational method can identify biologically relevant candidate biomarkers. It can be used in a wide range of applications that compare two conditions using highly heterogeneous datasets.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e7f0/4986197/6352b20dcf6b/12864_2016_2942_Fig1_HTML.jpg
