Merging High-Throughput, Amplicon-Based Second and Third Generation Sequencing Data: An Integrative and Modular Data Analysis Framework for Haplotype Prediction and Output Evaluation.

Authors

Mink Sylvia, Attenberger Christian, Busch Yannik, Kiefer Johanna, Peter Wolfgang, Cadamuro Janne, Steiert Tim A, Franke Andre, Gassner Christoph

Affiliations

Central Medical Laboratories, Carinagasse 41, 6800 Feldkirch, Austria.

Institute of Translational Medicine, Private University in the Principality of Liechtenstein, 9495 Triesen, Liechtenstein.

Publication information

Int J Mol Sci. 2025 Apr 7;26(7):3443. doi: 10.3390/ijms26073443.

Abstract

Despite providing highly accurate results, the short reads generated by second generation sequencing have major limitations in mapping complex genomic regions. Longer reads can resolve these issues and additionally phase distant variants. The third generation sequencing platform Oxford Nanopore Technologies (ONT) currently achieves the longest sequencing reads but falls short in sequencing accuracy. Additionally, deriving phased haplotypes from amplicon-based NGS data remains a complex and time-consuming task that requires extensive bioinformatic expertise. We constructed an integrative, open-access, modular data-analysis framework that automates the processing of high-throughput sequencing data from both second generation (Illumina) and third generation (ONT) sequencing platforms, combining the strengths of both technologies. Variant information is automatically evaluated and color-coded for discrepancies. Haplotypes are listed by frequency. All parts of the framework can be used independently. The framework's performance was validated using synthetic data and tested with real-life data by analyzing partly homologous // sequencing data from 400 blood donors.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f641/11990026/a8cbecc0b14e/ijms-26-03443-g001.jpg
