Merging High-Throughput, Amplicon-Based Second and Third Generation Sequencing Data: An Integrative and Modular Data Analysis Framework for Haplotype Prediction and Output Evaluation.

Authors

Mink Sylvia, Attenberger Christian, Busch Yannik, Kiefer Johanna, Peter Wolfgang, Cadamuro Janne, Steiert Tim A, Franke Andre, Gassner Christoph

Affiliations

Central Medical Laboratories, Carinagasse 41, 6800 Feldkirch, Austria.

Institute of Translational Medicine, Private University in the Principality of Liechtenstein, 9495 Triesen, Liechtenstein.

Publication information

Int J Mol Sci. 2025 Apr 7;26(7):3443. doi: 10.3390/ijms26073443.

DOI: 10.3390/ijms26073443
PMID: 40244459
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11990026/
Abstract

Despite providing highly accurate results, the short reads generated by second generation sequencing have major limitations in mapping complex genomic regions. Longer reads can resolve these issues and additionally phase distant variants. The third generation sequencing platform ONT currently achieves the longest sequencing reads but falls short in sequencing accuracy. Additionally, deriving phased haplotypes from amplicon-based NGS data remains a complex and time-consuming task that requires extensive bioinformatic expertise. We constructed an integrative, open-access modular data-analysis framework that allows for automated processing of high-throughput sequencing data from both second (Illumina) and third generation (ONT) sequencing platforms, combining the strengths of both technologies. Variant information is automatically evaluated and color-coded for discrepancies. Haplotypes are listed by frequency. All parts of the framework can be used independently. The framework's performance was validated using synthetic data and tested with real-life data by analyzing partly homologous // sequencing data from 400 blood donors.
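
The two reporting steps named in the abstract, flagging discrepancies between platforms and listing haplotypes by frequency, could be sketched roughly as follows. This is an illustrative sketch only and not the authors' framework: the function names, the status labels, and the dictionary-based data layout (position mapped to allele) are all assumptions made for the example.

```python
from collections import Counter


def compare_calls(illumina, ont):
    """Label each variant position by agreement between two call sets.

    Both inputs are hypothetical dicts mapping position -> called allele.
    """
    report = {}
    for pos in sorted(set(illumina) | set(ont)):
        a, b = illumina.get(pos), ont.get(pos)
        if a is None:
            status = "ont_only"
        elif b is None:
            status = "illumina_only"
        elif a == b:
            status = "concordant"
        else:
            status = "discordant"  # candidate for highlighting in a report
        report[pos] = (a, b, status)
    return report


def rank_haplotypes(reads, positions):
    """Count allele combinations carried by single long reads.

    Each read is a hypothetical dict mapping position -> allele; reads that
    do not cover every requested position are skipped.
    """
    counts = Counter(
        tuple(read[p] for p in positions)
        for read in reads
        if all(p in read for p in positions)
    )
    return counts.most_common()  # most frequent haplotype first


illumina_calls = {101: "A", 250: "G", 400: "T"}
ont_calls = {101: "A", 250: "C", 512: "G"}
for pos, (a, b, status) in compare_calls(illumina_calls, ont_calls).items():
    print(pos, a, b, status)

long_reads = [
    {101: "A", 250: "G"},
    {101: "A", 250: "G"},
    {101: "C", 250: "T"},
    {101: "A"},  # does not span both positions, so it is ignored
]
print(rank_haplotypes(long_reads, [101, 250]))
```

Counting allele combinations per read only works because a single long read spans all positions of interest, which is the phasing advantage the abstract attributes to longer reads; the short-read call set is used here purely as an accuracy cross-check.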

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/a8cbecc0b14e/ijms-26-03443-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/dd75ca2b687d/ijms-26-03443-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/ba6f8ffdf0b4/ijms-26-03443-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/1f0df440bbe2/ijms-26-03443-g004a.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/379e48492ed2/ijms-26-03443-g005.jpg
