CNARA: reliability assessment for genomic copy number profiles.

Authors

Ai Ni, Cai Haoyang, Solovan Caius, Baudis Michael

Affiliations

Institute of Molecular Life Sciences and Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, Zurich, CH-8057, Switzerland.

Center of Growth, Metabolism and Aging, Key Laboratory of Bio-Resources and Eco-Environment, College of Life Sciences, Sichuan University, Chengdu, Sichuan, 610064, China.

Publication information

BMC Genomics. 2016 Oct 12;17(1):799. doi: 10.1186/s12864-016-3074-7.

DOI:10.1186/s12864-016-3074-7
PMID:27733115
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5062840/
Abstract

BACKGROUND

DNA copy number profiles from microarray and sequencing experiments sometimes contain wave artefacts, which may be introduced during sample preparation and cannot be removed completely by existing preprocessing methods. In addition, a large derivative log ratio spread (DLRS) of the probes, which correlates with poor DNA quality, is sometimes observed in genome screening experiments and may lead to unreliable copy number profiles. Depending on the extent of these artefacts and the resulting misidentification of copy number alterations/variations (CNA/CNV), it may be desirable to exclude such samples from analyses or to adapt the downstream data analysis strategy accordingly.
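
As an illustration of the DLRS measure mentioned above, the sketch below uses one commonly used definition: the interquartile range of the differences between adjacent probe log ratios, rescaled to a standard-deviation-like value. The abstract does not state the exact formula used by CNARA, so the scaling constant and the function name `dlrs` are assumptions made for this example.

```python
import math
import statistics


def dlrs(log_ratios):
    """Robust spread of the differences between adjacent probe log ratios.

    Hypothetical helper: follows a common DLRS definition, not necessarily
    the one implemented in CNARA.
    """
    if len(log_ratios) < 3:
        raise ValueError("need at least three probes")
    # First-order differences between neighbouring probes along the genome.
    diffs = [b - a for a, b in zip(log_ratios, log_ratios[1:])]
    q1, _, q3 = statistics.quantiles(diffs, n=4, method="inclusive")
    # The IQR of a normal distribution is about 1.349 standard deviations,
    # and differencing two independent probes inflates the spread by sqrt(2).
    return (q3 - q1) / (1.349 * math.sqrt(2))
```

Because the measure is built on probe-to-probe differences, a genuine copy number change contributes only a single large difference at each breakpoint, so the value mainly reflects probe-level noise rather than the alterations themselves.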

RESULTS

Here, we propose a method to distinguish reliable genomic copy number profiles from those containing heavy wave artefacts and/or large DLRS. We define four features that adequately summarize the copy number profiles for reliability assessment, and train a classifier on a dataset of 1522 copy number profiles from various microarray platforms. The method can be applied to predict the reliability of copy number profiles irrespective of the underlying microarray platform, and may be adapted for sequencing platforms from which copy number estimates can be computed as a piecewise constant signal. Further details can be found at https://github.com/baudisgroup/CNARA.
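
To make the phrase "piecewise constant signal" concrete, the sketch below expands a list of segments into one fitted value per probe and computes the residuals between the observed log ratios and that fit. It does not reproduce the four CNARA features, which the abstract does not name. The `Segment` layout, the function names, and the choice of 0.0 for probes outside any segment are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

# Hypothetical segment layout: (first probe index, last probe index inclusive,
# mean log ratio of the segment).
Segment = Tuple[int, int, float]


def expand_segments(segments: Sequence[Segment], n_probes: int) -> List[float]:
    """Turn segments into a piecewise constant signal with one value per probe."""
    # Probes not covered by any segment stay at 0.0 (an assumption of this sketch).
    signal = [0.0] * n_probes
    for first, last, mean in segments:
        for i in range(first, last + 1):
            signal[i] = mean
    return signal


def residuals(log_ratios: Sequence[float], segments: Sequence[Segment]) -> List[float]:
    """Observed probe log ratios minus the piecewise constant fit."""
    fitted = expand_segments(segments, len(log_ratios))
    return [x - f for x, f in zip(log_ratios, fitted)]
```

Any platform whose output can be reduced to such segments, whether microarray probes or sequencing bins, could in principle be summarized this way before a reliability assessment.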

CONCLUSIONS

We have developed a method for assessing genomic copy number profiling data, and suggest applying it in addition to, and after, other state-of-the-art noise correction and quality control procedures. CNARA could be instrumental in improving the assessment of data used for genomic data mining experiments and could support the reliable functional attribution of copy number aberrations, especially in cancer research.

Figures

Fig. 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9e3/5062840/539e3bd4188d/12864_2016_3074_Fig1_HTML.jpg
Fig. 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9e3/5062840/8885cc7563b4/12864_2016_3074_Fig2_HTML.jpg
Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9e3/5062840/1c1c8ca49375/12864_2016_3074_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b9e3/5062840/f9a5da8b8a53/12864_2016_3074_Fig4_HTML.jpg

Cited by

1
The Progenetix oncogenomic resource in 2021.
Database (Oxford). 2021 Jul 17;2021. doi: 10.1093/database/baab043.
2
Hierarchical discovery of large-scale and focal copy number alterations in low-coverage cancer genomes.
BMC Bioinformatics. 2020 Apr 16;21(1):147. doi: 10.1186/s12859-020-3480-3.
