A Pipeline for Reconstructing Somatic Copy Number Alternation's Subclonal Population-Based Next-Generation Sequencing Data.

Authors

Chu Yanshuo, Nie Chenxi, Wang Yadong

Affiliation

Center of Bioinformatics, School of Computer Science and Technology, Harbin Institute of Technology, Harbin, China.

Publication

Front Genet. 2020 Feb 27;10:1374. doi: 10.3389/fgene.2019.01374. eCollection 2019.


DOI: 10.3389/fgene.2019.01374
PMID: 32180789
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7058119/
Abstract

State-of-the-art next-generation sequencing (NGS)-based subclonal reconstruction methods perform poorly on somatic copy number alterations (SCNAs) for two reasons: the subclonal population frequency and the absolute copy number of each SCNA must be estimated simultaneously, and the tumor and paired normal sequencing data contain complex bias and noise. Both existing NGS-based SCNA detection methods and tools for inferring the subclonal population frequency of SCNAs use the read count ratio (RCR) of the tumor to its paired normal sample as the key feature of tumor sequencing data. However, sequencing error and bias strongly affect the RCR, producing a large number of redundant SCNA segments that make the subsequent subclonal population frequency inference and subclonal reconstruction time-consuming and inaccurate. We mathematically analyze the number of solutions for the subclonal frequency of an SCNA and, based on this analysis, propose a computational algorithm that reduces the impact of false breakpoints. We construct a new probability model that incorporates an RCR bias correction algorithm and chain it with the false breakpoint filtering algorithm to build a complete pipeline for reconstructing the subclonal populations of SCNAs. Experimental results show that our pipeline outperforms existing subclonal reconstruction programs on both simulated data and TCGA data. Source code is publicly available as a Python package at https://github.com/dustincys/msphy-SCNAClonal.
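
The abstract notes that the subclonal frequency and the absolute copy number of an SCNA have to be estimated together, and that the number of solutions for the subclonal frequency matters. The sketch below illustrates why a single RCR value is ambiguous. It uses a generic two-population mixture model, which is an assumption made for illustration and not necessarily the probability model of the paper: a fraction `phi` of cells carries `copy_number` copies of the segment, the rest carries the normal two copies, and the RCR is taken to be already normalized for sequencing depth. The function names `expected_rcr` and `candidate_solutions` are hypothetical and do not come from the msphy-SCNAClonal package.

```python
def expected_rcr(copy_number, phi, normal_copies=2):
    """Expected tumor/normal read count ratio for one segment.

    Assumed model: a fraction phi of cells has copy_number copies,
    the remaining cells have normal_copies copies.
    """
    mixed = phi * copy_number + (1.0 - phi) * normal_copies
    return mixed / normal_copies


def candidate_solutions(observed_rcr, max_copy_number=6, normal_copies=2):
    """Return every (copy_number, phi) pair with 0 < phi <= 1 that
    reproduces observed_rcr exactly under the assumed mixture model."""
    solutions = []
    for c in range(0, max_copy_number + 1):
        if c == normal_copies:
            continue  # a copy-neutral state cannot shift the ratio
        # Invert rcr = 1 + phi * (c - n) / n for phi.
        phi = normal_copies * (observed_rcr - 1.0) / (c - normal_copies)
        if 0.0 < phi <= 1.0:
            solutions.append((c, phi))
    return solutions


if __name__ == "__main__":
    # One observed ratio is compatible with several copy-number states.
    for c, phi in candidate_solutions(1.25):
        print(f"copy number {c}: subclonal frequency {phi:.3f}")
```

Under these assumptions, an observed ratio of 1.25 is explained equally well by copy numbers 3, 4, 5, and 6 at progressively smaller subclonal frequencies, so the ratio alone cannot identify a unique solution. A spurious breakpoint that splits one segment into two slightly different ratios adds further candidate solutions, which is consistent with the abstract's point that redundant segments make downstream inference slower and less accurate.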
