Stratifying tumour subtypes based on copy number alteration profiles using next-generation sequence data.

Affiliations

Department of Statistics, University of Leeds, Leeds, LS2 9JT, UK.

Department of Thoracic Surgery, St. James Hospital, Leeds, LS9 7TF, UK.

Publication information

Bioinformatics. 2015 Aug 15;31(16):2713-20. doi: 10.1093/bioinformatics/btv191. Epub 2015 Apr 5.

Abstract

MOTIVATION

The role of personalized medicine and targeted treatment in the clinical management of cancer patients has become increasingly important in recent years. This has made the task of precise histological substratification of cancers crucial. Increasingly, genomic data are being seen as a valuable classifier. Specifically, copy number alteration (CNA) profiles generated by next-generation sequencing (NGS) can become a determinant for tumour subtyping. The principal purpose of this study is to devise a model with good prediction capability for the tumour histological subtypes as a function of both the patients' covariates and their genome-wide CNA profiles from NGS data.

RESULTS

We investigate a logistic regression for modelling tumour histological subtypes as a function of the patients' covariates and their CNA profiles, in a mixed model framework. The covariates, such as age and gender, are considered as fixed predictors and the genome-wide CNA profiles are considered as random predictors. We illustrate the application of this model in lung and oral cancer datasets, and the results indicate that the tumour histological subtypes can be modelled with a good fit. Our cross-validation indicates that the logistic regression exhibits the best prediction relative to other classification methods we considered in this study. The model also exhibits the best agreement in the prediction between smooth-segmented and circular binary-segmented CNA profiles.
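
The model structure described above can be illustrated with a minimal sketch. It is not the authors' CNALR package. It assumes the random CNA effects are normally distributed with a known variance, in which case fitting reduces to a logistic regression with a ridge penalty on the CNA coefficients only, while the fixed covariates stay unpenalised. The function name `fit_penalised_logistic`, the penalty value `lam`, and the synthetic data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def sigmoid(eta):
    """Logistic function; eta is clipped to avoid overflow in exp."""
    return 1.0 / (1.0 + np.exp(-np.clip(eta, -30.0, 30.0)))


def fit_penalised_logistic(X_fixed, Z_random, y, lam=10.0, n_iter=100, tol=1e-8):
    """Fit logit P(y = 1) = X_fixed @ beta + Z_random @ b by Newton-Raphson.

    Only b (the CNA effects) carries the ridge penalty lam * ||b||^2 / 2.
    Assumption: b ~ N(0, sigma^2 I) with sigma^2 fixed, so lam = 1 / sigma^2.
    """
    p = X_fixed.shape[1]
    q = Z_random.shape[1]
    W = np.hstack([X_fixed, Z_random])
    # Penalty matrix: zero for fixed effects, lam for random effects.
    P = np.diag(np.concatenate([np.zeros(p), np.full(q, lam)]))
    theta = np.zeros(p + q)
    for _ in range(n_iter):
        mu = sigmoid(W @ theta)
        grad = W.T @ (y - mu) - P @ theta        # penalised score
        weights = mu * (1.0 - mu)
        hess = W.T @ (W * weights[:, None]) + P  # penalised information
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta[:p], theta[p:]


# Synthetic data (hypothetical): 80 patients, 50 genomic windows.
rng = np.random.default_rng(0)
n, q = 80, 50
age = rng.normal(size=n)                # standardised age
gender = rng.integers(0, 2, size=n)     # 0/1 indicator
X = np.column_stack([np.ones(n), age, gender])
Z = rng.normal(size=(n, q))             # stand-in for segmented CNA values
b_true = rng.normal(scale=0.3, size=q)
eta_true = 0.5 * age - 0.3 * gender + Z @ b_true
y = rng.binomial(1, sigmoid(eta_true)).astype(float)

beta_hat, b_hat = fit_penalised_logistic(X, Z, y, lam=10.0)
prob = sigmoid(X @ beta_hat + Z @ b_hat)  # fitted subtype probabilities
print("fixed-effect estimates:", np.round(beta_hat, 3))
print("training accuracy:", np.mean((prob > 0.5) == (y == 1)))
```

In this sketch a larger `lam` shrinks the CNA effects more strongly towards zero. In practice the variance component would be estimated from the data or chosen by cross-validation rather than fixed in advance.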

AVAILABILITY AND IMPLEMENTATION

An R package to run the logistic regression is available at http://www1.maths.leeds.ac.uk/~arief/R/CNALR/.

CONTACT

a.gusnanto@leeds.ac.uk

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
