
PBHoney: identifying genomic variants via long-read discordance and interrupted mapping.

Authors

English Adam C, Salerno William J, Reid Jeffrey G

Affiliation

Human Genome Sequencing Center at Baylor College of Medicine, One Baylor Plaza, Houston 77030, Texas, USA.

Publication information

BMC Bioinformatics. 2014 Jun 10;15:180. doi: 10.1186/1471-2105-15-180.

DOI: 10.1186/1471-2105-15-180
PMID: 24915764
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4082283/

Abstract

BACKGROUND

As resequencing projects become more prevalent across a larger number of species, accurate variant identification will further elucidate the nature of genetic diversity and become increasingly relevant in genomic studies. However, the identification of larger genomic variants via DNA sequencing is limited by both the incomplete information provided by sequencing reads and the nature of the genome itself. Long-read sequencing technologies provide high-resolution access to structural variants often inaccessible to shorter reads.

RESULTS

We present PBHoney, software that considers both intra-read discordance and soft-clipped tails of long reads (>10,000 bp) to identify structural variants. As a proof of concept, we identify four structural variants and two genomic features in a strain of Escherichia coli with PBHoney and validate them via de novo assembly. PBHoney is available for download at http://sourceforge.net/projects/pb-jelly/.
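The two kinds of evidence named above, soft-clipped tails and discordance within a single read alignment, can both be read off a SAM CIGAR string. The following is a minimal sketch under stated assumptions, not PBHoney's actual implementation: the function names, the `min_tail` and `min_size` thresholds, and the choice to treat large CIGAR insertions and deletions as "discordance" are all illustrative.

```python
import re

# One CIGAR element: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a SAM CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]


def soft_clipped_tails(cigar, min_tail=100):
    """Return soft-clipped tails of at least min_tail bases (hypothetical threshold)."""
    # Hard clips carry no sequence, so ignore them when looking at read ends.
    ops = [(n, op) for n, op in parse_cigar(cigar) if op != "H"]
    tails = {}
    if ops and ops[0][1] == "S" and ops[0][0] >= min_tail:
        tails["left"] = ops[0][0]
    if len(ops) > 1 and ops[-1][1] == "S" and ops[-1][0] >= min_tail:
        tails["right"] = ops[-1][0]
    return tails


def intra_read_events(cigar, ref_start, min_size=50):
    """Report insertions/deletions of at least min_size bases inside one alignment."""
    events = []
    ref_pos = ref_start
    for length, op in parse_cigar(cigar):
        if op in ("I", "D") and length >= min_size:
            events.append((op, ref_pos, length))
        # Only these operations advance the position on the reference.
        if op in ("M", "D", "N", "=", "X"):
            ref_pos += length
    return events


if __name__ == "__main__":
    print(soft_clipped_tails("250S1000M30S"))
    print(intra_read_events("500M120D300M60I200M5I100M", ref_start=1000))
```

In this sketch a long soft clip flags a read whose tail could be realigned elsewhere to locate a breakpoint, while a large gap inside the alignment flags a candidate insertion or deletion at a known reference position.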

CONCLUSIONS

PBHoney implements two variant-identification approaches that exploit the high mappability of long reads, and it detects larger structural variants effectively using whole-genome Pacific Biosciences RS II Continuous Long Reads. PBHoney also discovers two genomic features: the presence of the Rac prophage in the isolate and evidence that the E. coli genome is circular.
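One way evidence of a circular genome could appear in read mappings is a read whose main alignment runs up to one end of the linearized reference while its tail aligns at the opposite end on the same strand. The check below is a hedged illustration of that idea only; the `Piece` type, the `supports_circularity` function, the `slack` tolerance, and the reference length used in the example are hypothetical and are not taken from the paper.

```python
from typing import NamedTuple


class Piece(NamedTuple):
    """One aligned piece of a read: 0-based, half-open reference interval."""
    ref_start: int
    ref_end: int
    strand: str  # "+" or "-"


def supports_circularity(a, b, ref_len, slack=50):
    """True if two pieces of one read touch opposite ends of the reference.

    Assumes both pieces come from the same read and that the reference is a
    linearized copy of a circular chromosome. `slack` (hypothetical) allows
    the alignments to stop a few bases short of either end.
    """
    if a.strand != b.strand:
        return False
    first, second = sorted((a, b), key=lambda p: p.ref_start)
    return first.ref_start <= slack and second.ref_end >= ref_len - slack


if __name__ == "__main__":
    ref_len = 4_600_000  # illustrative length, not a real assembly size
    body = Piece(4_592_000, 4_599_990, "+")
    tail = Piece(0, 3_500, "+")
    print(supports_circularity(body, tail, ref_len))
```

Requiring the same strand for both pieces keeps the check from confusing an origin-spanning read with an inversion-like signal, where the tail would map in the opposite orientation.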




