A statistical approach for tracking clonal dynamics in cancer using longitudinal next-generation sequencing data.

Author affiliations

Department of Oncology, University of Oxford, Oxford, OX3 7DQ, UK.

Nuffield Department of Medicine, Wellcome Centre for Human Genetics, University of Oxford, Oxford, OX3 7BN, UK.

Publication information

Bioinformatics. 2021 Apr 19;37(2):147-154. doi: 10.1093/bioinformatics/btaa672.

Abstract

MOTIVATION

Tumours are composed of distinct cancer cell populations (clones), which continuously adapt to their local micro-environment. Standard methods for clonal deconvolution seek to identify groups of mutations and estimate the prevalence of each group in the tumour, while accounting for the tumour's purity and copy number profile. These methods have been applied to cross-sectional data and to longitudinal data, in the latter case after discarding information on the timing of sample collection. Two key questions follow: how can such information be incorporated into the analysis, and is there any benefit in doing so?
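As a rough illustration of what "considering purity and copy number" involves, the sketch below uses the relation commonly assumed in clonal deconvolution between a mutation's cancer cell fraction and its expected variant allele fraction (VAF). The function and parameter names (`expected_vaf`, `ccf_from_vaf`, `mult`, `normal_cn`) are hypothetical and are not taken from the paper or from clonosGP.

```python
def expected_vaf(ccf, purity, tumour_cn, mult=1, normal_cn=2):
    """Expected variant allele fraction of a mutation.

    ccf       : fraction of cancer cells carrying the mutation (0..1)
    purity    : fraction of cells in the sample that are cancer cells (0..1)
    tumour_cn : total copy number at the locus in cancer cells
    mult      : number of mutated copies per mutated cell (assumed known)
    normal_cn : copy number at the locus in normal cells (2 on autosomes)
    """
    mutated_copies = purity * ccf * mult
    total_copies = purity * tumour_cn + (1.0 - purity) * normal_cn
    return mutated_copies / total_copies


def ccf_from_vaf(vaf, purity, tumour_cn, mult=1, normal_cn=2):
    """Invert the relation above; the result is capped at 1."""
    total_copies = purity * tumour_cn + (1.0 - purity) * normal_cn
    return min(1.0, vaf * total_copies / (purity * mult))


# Hypothetical example: a subclonal heterozygous mutation in a diploid
# region of a sample that is 80% tumour.
vaf = expected_vaf(ccf=0.5, purity=0.8, tumour_cn=2)
ccf = ccf_from_vaf(vaf, purity=0.8, tumour_cn=2)
```

Deconvolution methods typically go further than this point estimate: they model the observed read counts around the expected VAF and share one prevalence value among all mutations assigned to the same cluster.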

RESULTS

We developed a clonal deconvolution method that explicitly incorporates the temporal spacing of longitudinally sampled tumours. By merging a Dirichlet Process Mixture Model with Gaussian Process priors and using as input a sequence of several sparsely collected samples, our method can reconstruct the temporal profile of the abundance of any mutation cluster supported by the data as a continuous function of time. We benchmarked our method on whole genome, whole exome and targeted sequencing data from patients with chronic lymphocytic leukaemia, on liquid biopsy data from a patient with melanoma, and on synthetic data. We found that incorporating information on the timing of tissue collection improves model performance, as long as data of sufficient volume and complexity are available for estimating free model parameters. Thus, our approach is particularly useful when collecting a relatively long sequence of tumour samples is feasible, as in liquid cancers (e.g. leukaemia) and liquid biopsies.
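To make the idea of a continuous-time abundance profile concrete, here is a minimal sketch of Gaussian Process regression applied to the abundance of a single cluster measured at irregular time points. It is not the joint Dirichlet Process and Gaussian Process model described above and it is not the clonosGP API: the squared-exponential kernel, the logit transform, the hyperparameter values and the data are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(t1, t2, amplitude=1.0, lengthscale=1.0):
    """Squared-exponential covariance between two vectors of time points."""
    d = t1[:, None] - t2[None, :]
    return amplitude**2 * np.exp(-0.5 * (d / lengthscale) ** 2)


def gp_posterior(t_obs, y_obs, t_new, amplitude=1.0, lengthscale=1.0, noise=0.1):
    """Posterior mean and standard deviation of a zero-mean GP at t_new."""
    K = rbf_kernel(t_obs, t_obs, amplitude, lengthscale)
    K = K + noise**2 * np.eye(len(t_obs))
    K_s = rbf_kernel(t_new, t_obs, amplitude, lengthscale)
    K_ss = rbf_kernel(t_new, t_new, amplitude, lengthscale)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_obs))
    mean = K_s @ alpha
    v = np.linalg.solve(L, K_s.T)
    var = np.diag(K_ss - v.T @ v)
    return mean, np.sqrt(np.clip(var, 0.0, None))


def logit(p):
    return np.log(p / (1.0 - p))


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


# Hypothetical data: abundance of one cluster at five unevenly spaced
# sampling times (in months). The uneven spacing is the information that
# is lost when samples are treated as an unordered set.
t_obs = np.array([0.0, 3.0, 4.0, 10.0, 18.0])
phi_obs = np.array([0.10, 0.25, 0.30, 0.70, 0.90])

# Work on the logit scale so that the reconstructed curve stays in (0, 1).
t_new = np.linspace(0.0, 18.0, 50)
mean, sd = gp_posterior(t_obs, logit(phi_obs), t_new,
                        amplitude=2.0, lengthscale=6.0, noise=0.2)
phi_curve = sigmoid(mean)          # central estimate of abundance over time
phi_lower = sigmoid(mean - 2 * sd) # rough lower band
phi_upper = sigmoid(mean + 2 * sd) # rough upper band
```

In this sketch the hyperparameters are fixed by hand. In a full model they are free parameters estimated from the data, which is consistent with the observation above that the benefit of using sampling times depends on having enough data to estimate them.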

AVAILABILITY AND IMPLEMENTATION

The statistical methodology presented in this paper is freely available at github.com/dvav/clonosGP.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
