Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.

Affiliations

State Key Laboratory of Bioelectronics, School of Biological Science and Medical Engineering, Southeast University, Nanjing 210096, China.

Monash University-Southeast University Joint Research Institute, Suzhou 215123, China.

Publication information

Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad275.

DOI: 10.1093/bib/bbad275
PMID: 37529913

Abstract

MOTIVATION

Multiple displacement amplification (MDA) has become the most commonly used method of whole genome amplification, generating large amounts of DNA with higher molecular weight and greater genome coverage. Coupled with long-read sequencing, it makes it possible to sequence amplicons over 20 kb in length. However, the formation of chimeric sequences (chimeras, which appear as structural errors in sequencing data) during MDA seriously interferes with bioinformatics analysis, and its influence on long-read sequencing data is unknown.

RESULTS

We sequenced phi29 DNA polymerase-mediated MDA amplicons on the PacBio platform and analyzed the chimeras in the resulting data. We constructed 3rd-ChimeraMiner, a pipeline that recognizes chimeras in long-read sequencing data and restores them to their original structures, improving the efficiency of using third-generation sequencing (TGS) data. Five long-read datasets and one high-fidelity long-read dataset with various amplification folds were analyzed. The results reveal that mis-priming events during amplification occur more frequently than widely perceived, and that the proportion gradually accumulates from 42% to over 78% as amplification continues. In total, 99.92% of recognized chimeric sequences were shown to be artifacts whose structures were wrongly formed during MDA rather than present in the original genomes. Restoring chimeras to their original structures recycles the vast majority of supplementary alignments that introduce false-positive structural variants, removing 97% of inversions on average and aiding the analysis of structural variation in MDA-amplified samples. The impact of chimeras on long-read sequencing data analysis should be emphasized, and 3rd-ChimeraMiner can help to quantify and reduce their influence.
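The link between supplementary alignments and false-positive inversions can be illustrated with a minimal Python sketch. It is not the 3rd-ChimeraMiner algorithm, whose details the abstract does not give. It only assumes that reads have been aligned to a reference and written as SAM records, and it relies on two flag bits from the SAM format: 0x800 (supplementary alignment) and 0x10 (reverse strand). The function names and the toy records are hypothetical, invented for illustration.

```python
from collections import defaultdict

# Flag bits defined by the SAM format.
UNMAPPED = 0x4
REVERSE = 0x10
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def parse_segments(sam_lines):
    """Group mapped, non-secondary alignment records by read name."""
    segments = defaultdict(list)
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag, ref, pos = fields[0], int(fields[1]), fields[2], int(fields[3])
        if flag & (UNMAPPED | SECONDARY):
            continue
        segments[name].append({
            "ref": ref,
            "pos": pos,
            "strand": "-" if flag & REVERSE else "+",
            "supplementary": bool(flag & SUPPLEMENTARY),
        })
    return segments


def classify_read(records):
    """Label a read by how its supplementary alignments relate to the primary one."""
    primary = [r for r in records if not r["supplementary"]]
    supplementary = [r for r in records if r["supplementary"]]
    if not primary or not supplementary:
        return "not_split"
    first = primary[0]
    if any(s["ref"] != first["ref"] for s in supplementary):
        return "inter_reference"
    if any(s["strand"] != first["strand"] for s in supplementary):
        return "inverted"  # would look like an inversion to an SV caller
    return "direct"


def summarize(sam_lines):
    """Count reads per category."""
    counts = defaultdict(int)
    for records in parse_segments(sam_lines).values():
        counts[classify_read(records)] += 1
    return dict(counts)


# Toy records (invented): readA is split across strands, readB is not split.
toy_sam = [
    "@HD\tVN:1.6",
    "readA\t0\tchr1\t1000\t60\t500M500S\t*\t0\t0\t*\t*",
    "readA\t2064\tchr1\t1400\t60\t500H500M\t*\t0\t0\t*\t*",
    "readB\t0\tchr2\t5000\t60\t1000M\t*\t0\t0\t*\t*",
]
print(summarize(toy_sam))
```

In this sketch a read such as `readA`, whose primary and supplementary alignments fall on opposite strands of the same reference, is counted as an inversion-like candidate. Deciding whether such a read is an MDA artifact or a true structural variant needs further evidence that the sketch does not model.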

AVAILABILITY AND IMPLEMENTATION

The 3rd-ChimeraMiner is available on GitHub, https://github.com/dulunar/3rdChimeraMiner.

Similar articles

1. Exploration of whole genome amplification generated chimeric sequences in long-read sequencing data.
Brief Bioinform. 2023 Sep 20;24(5). doi: 10.1093/bib/bbad275.
2. ChimeraMiner: An Improved Chimeric Read Detection Pipeline and Its Application in Single Cell Sequencing.
Int J Mol Sci. 2019 Apr 21;20(8):1953. doi: 10.3390/ijms20081953.
3. Systematic Characteristic Exploration of the Chimeras Generated in Multiple Displacement Amplification through Next Generation Sequencing Data Reanalysis.
PLoS One. 2015 Oct 6;10(10):e0139857. doi: 10.1371/journal.pone.0139857. eCollection 2015.
4. Chimera: The spoiler in multiple displacement amplification.
Comput Struct Biotechnol J. 2023 Feb 23;21:1688-1696. doi: 10.1016/j.csbj.2023.02.034. eCollection 2023.
5. RepLong: de novo repeat identification using long read sequencing data.
Bioinformatics. 2018 Apr 1;34(7):1099-1107. doi: 10.1093/bioinformatics/btx717.
6. Assessment of REPLI-g Multiple Displacement Whole Genome Amplification (WGA) Techniques for Metagenomic Applications.
J Biomol Tech. 2017 Apr;28(1):46-55. doi: 10.7171/jbt.17-2801-008. Epub 2017 Mar 21.
7. Long-read metagenomics of multiple displacement amplified DNA of low-biomass human gut phageomes by SACRA pre-processing chimeric reads.
DNA Res. 2021 Oct 11;28(6). doi: 10.1093/dnares/dsab019.
8. SVsearcher: A more accurate structural variation detection method in long read data.
Comput Biol Med. 2023 May;158:106843. doi: 10.1016/j.compbiomed.2023.106843. Epub 2023 Mar 31.
9. A comprehensive investigation of metagenome assembly by linked-read sequencing.
Microbiome. 2020 Nov 11;8(1):156. doi: 10.1186/s40168-020-00929-3.
10. Mechanism of chimera formation during the Multiple Displacement Amplification reaction.
BMC Biotechnol. 2007 Apr 12;7:19. doi: 10.1186/1472-6750-7-19.

Cited by

1. Impact of Packaging Methods on Physicochemical Properties, Flavor Profile, and Microbial Community in Low-Temperature Stored Mianning Ham.
Foods. 2025 Jul 1;14(13):2336. doi: 10.3390/foods14132336.
2. SKSR1 identified as key virulence factor in Cryptosporidium by genetic crossing.
Nat Commun. 2025 May 20;16(1):4694. doi: 10.1038/s41467-025-60088-7.
3. Multicopy subtelomeric genes underlie animal infectivity of divergent Cryptosporidium hominis subtypes.
Nat Commun. 2024 Dec 30;15(1):10774. doi: 10.1038/s41467-024-54995-4.
4. Multiple Displacement Amplification Facilitates SMRT Sequencing of Microscopic Animals and the Genome of the Gastrotrich Lepidodermella squamata (Dujardin 1841).
Genome Biol Evol. 2024 Dec 4;16(12). doi: 10.1093/gbe/evae254.
5. Single-cell somatic copy number variants in brain using different amplification methods and reference genomes.
Commun Biol. 2024 Oct 9;7(1):1288. doi: 10.1038/s42003-024-06940-w.
6. Low-input PacBio sequencing generates high-quality individual fly genomes and characterizes mutational processes.
Nat Commun. 2024 Jul 5;15(1):5644. doi: 10.1038/s41467-024-49992-6.
7. Mapping recurrent mosaic copy number variation in human neurons.
Nat Commun. 2024 May 17;15(1):4220. doi: 10.1038/s41467-024-48392-0.
8. FLED: a full-length eccDNA detector for long-reads sequencing data.
Brief Bioinform. 2023 Sep 22;24(6). doi: 10.1093/bib/bbad388.