Beta approximation of ratio distribution and its application to next generation sequencing read counts.

Authors

Yang Shengping, Fang Zhide

Affiliations

Department of Pathology, School of Medicine, Texas Tech University Health Science Center, Lubbock, Texas, USA.

Biostatistics Program, School of Public Health, LSU Health Sciences Center, New Orleans, Louisiana, USA.

Publication information

J Appl Stat. 2017;44(1):57-70. doi: 10.1080/02664763.2016.1158798. Epub 2016 Mar 16.

Abstract

Paired sequencing data are commonly collected in genomic studies to control for biological variation. However, existing data processing strategies perform poorly in low-coverage regions, which are unavoidable given the limitations of current sequencing technology. Furthermore, the information contained in the absolute values of the read counts is commonly ignored. We propose a method for processing (modifying) read count ratios that not only incorporates the information contained in the absolute values of paired counts into one variable, but also mitigates discreteness artifacts, especially when both counts are small. Simulation shows that the processed variable is well fitted by a Beta distribution, thus providing a convenient tool for downstream inference.
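
The abstract does not state the actual transformation, so the following is only an illustrative sketch of the general idea: turn each pair of read counts into a single value in (0, 1) and fit a Beta distribution to those values. The pseudo-count smoothing, the function names `smoothed_ratio` and `fit_beta_moments`, the method-of-moments fit, and the example counts are all assumptions made for illustration and are not taken from the paper.

```python
from statistics import mean, pvariance


def smoothed_ratio(x, y, pseudo=0.5):
    """Share of reads from sample 1, smoothed with a pseudo-count.

    The pseudo-count is an assumed stand-in for the paper's processing
    step; it keeps the value strictly inside (0, 1) even for zero counts.
    """
    return (x + pseudo) / (x + y + 2 * pseudo)


def fit_beta_moments(values):
    """Method-of-moments estimates (alpha, beta) for values in (0, 1)."""
    m = mean(values)
    v = pvariance(values)
    # A Beta distribution requires 0 < variance < mean * (1 - mean).
    if v <= 0 or v >= m * (1 - m):
        raise ValueError("sample variance is incompatible with a Beta fit")
    common = m * (1 - m) / v - 1
    return m * common, (1 - m) * common


# Made-up paired read counts (sample 1, sample 2), including small counts.
pairs = [(3, 5), (0, 2), (10, 12), (7, 4), (1, 1), (25, 30)]
ratios = [smoothed_ratio(x, y) for x, y in pairs]
alpha, beta = fit_beta_moments(ratios)
print(f"alpha={alpha:.2f}, beta={beta:.2f}")
```

With a raw ratio, a pair such as (0, 2) would collapse to exactly 0 and many small-count pairs would share the same few values. The smoothing here is one simple way to soften that discreteness; the paper's own method may differ substantially.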
