Reducing noise and stutter in short tandem repeat loci with unique molecular identifiers.

Affiliations

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA; Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

Center for Human Identification, University of North Texas Health Science Center, 3500 Camp Bowie Blvd., Fort Worth, TX 76107, USA.

Publication information

Forensic Sci Int Genet. 2021 Mar;51:102459. doi: 10.1016/j.fsigen.2020.102459. Epub 2020 Dec 25.

DOI:10.1016/j.fsigen.2020.102459
PMID:33429137
Abstract

Unique molecular identifiers (UMIs) are a promising approach to contend with errors generated during PCR and massively parallel sequencing (MPS). With UMI technology, random molecular barcodes are ligated to template DNA molecules prior to PCR, allowing PCR and sequencing error to be tracked and corrected bioinformatically. UMIs have the potential to be particularly informative for the interpretation of short tandem repeats (STRs). Traditional MPS approaches may simply lead to the observation of alleles that are consistent with the hypotheses of stutter, while with UMIs stutter products may be bioinformatically re-associated with their parental alleles and subsequently removed. Herein, a bioinformatics pipeline named strumi is described that is designed for the analysis of STRs that are tagged with UMIs. Unlike other tools, strumi is an alignment-free machine learning driven algorithm that clusters individual MPS reads into UMI families, infers consensus super-reads that represent each family and provides an estimate of the resulting haplotype's accuracy. Super-reads, in turn, approximate independent measurements not of the PCR products, but of the original template molecules, both in terms of quantity and sequence identity. Provisional assessments show that naïve threshold-based approaches generate super-reads that are accurate (∼97 % haplotype accuracy, compared to ∼78 % when UMIs are not used), and the application of a more nuanced machine learning approach increases the accuracy to ∼99.5 % depending on the level of certainty desired. With these features, UMIs may greatly simplify probabilistic genotyping systems and reduce uncertainty. However, the ability to interpret alleles at trace levels also permits the interpretation, characterization and quantification of contamination as well as somatic variation (including somatic stutter), which may present newfound challenges.
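
The idea of grouping reads into UMI families and calling one consensus super-read per family can be illustrated with a minimal Python sketch. This is not the strumi algorithm, which the abstract describes as alignment-free and machine learning driven. It is only a toy version of a naive threshold-based approach. The function names, the thresholds `min_family_size` and `min_fraction`, the exact-match UMI grouping, and the example reads are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def build_super_reads(reads, min_family_size=3, min_fraction=0.5):
    """Group (umi, sequence) pairs by UMI and call one consensus per family.

    Hypothetical thresholds: a family yields a super-read only if it holds
    at least `min_family_size` reads and its most common sequence makes up
    more than `min_fraction` of the family.
    """
    families = defaultdict(list)
    for umi, seq in reads:
        # Assumption: UMIs are matched exactly (no UMI error correction).
        families[umi].append(seq)

    super_reads = {}
    for umi, seqs in families.items():
        if len(seqs) < min_family_size:
            continue  # too few reads to trust a consensus
        top_seq, top_count = Counter(seqs).most_common(1)[0]
        if top_count / len(seqs) > min_fraction:
            super_reads[umi] = top_seq
    return super_reads


def count_haplotypes(super_reads):
    """Count super-reads per haplotype, approximating template molecule counts."""
    return Counter(super_reads.values())


# Made-up example: two template molecules carrying a 4-repeat allele.
allele = "TCTA" * 4
stutter = "TCTA" * 3  # one repeat shorter, as a stutter product would be

example_reads = [
    ("AAAC", allele), ("AAAC", allele), ("AAAC", allele), ("AAAC", stutter),
    ("GGTT", allele), ("GGTT", allele), ("GGTT", allele),
    ("CCGA", stutter),  # singleton family, dropped by the size threshold
]

example_super_reads = build_super_reads(example_reads)
example_counts = count_haplotypes(example_super_reads)
```

In this toy input the stutter read inside family `AAAC` is outvoted by its parental allele, and the singleton family is discarded, so the haplotype counts reflect two template molecules rather than eight PCR products.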


Similar articles

1
Reducing noise and stutter in short tandem repeat loci with unique molecular identifiers.
Forensic Sci Int Genet. 2021 Mar;51:102459. doi: 10.1016/j.fsigen.2020.102459. Epub 2020 Dec 25.
2
Ultrasensitive sequencing of STR markers utilizing unique molecular identifiers and the SiMSen-Seq method.
Forensic Sci Int Genet. 2024 Jul;71:103047. doi: 10.1016/j.fsigen.2024.103047. Epub 2024 Apr 3.
3
Using unique molecular identifiers to improve allele calling in low-template mixtures.
Forensic Sci Int Genet. 2023 Mar;63:102807. doi: 10.1016/j.fsigen.2022.102807. Epub 2022 Nov 24.
4
Characterizing stutter variants in forensic STRs with massively parallel sequencing.
Forensic Sci Int Genet. 2020 Mar;45:102225. doi: 10.1016/j.fsigen.2019.102225. Epub 2019 Dec 9.
5
Mixture deconvolution by massively parallel sequencing of microhaplotypes.
Int J Legal Med. 2019 May;133(3):719-729. doi: 10.1007/s00414-019-02010-7. Epub 2019 Feb 13.
6
Assessing non-LUS stutter in DNA sequence data.
Forensic Sci Int Genet. 2022 Jul;59:102706. doi: 10.1016/j.fsigen.2022.102706. Epub 2022 Apr 16.
7
toaSTR: A web application for forensic STR genotyping by massively parallel sequencing.
Forensic Sci Int Genet. 2018 Nov;37:21-28. doi: 10.1016/j.fsigen.2018.07.006. Epub 2018 Jul 6.
8
Evaluation of ArmedXpert software tools, MixtureAce and Mixture Interpretation, to analyze MPS-STR data.
Forensic Sci Int Genet. 2022 Jan;56:102603. doi: 10.1016/j.fsigen.2021.102603. Epub 2021 Oct 12.
9
FDSTools: A software package for analysis of massively parallel sequencing data with the ability to recognise and correct STR stutter and other PCR or sequencing noise.
Forensic Sci Int Genet. 2017 Mar;27:27-40. doi: 10.1016/j.fsigen.2016.11.007. Epub 2016 Nov 27.
10
Isometric artifacts from polymerase chain reaction-massively parallel sequencing analysis of short tandem repeat loci: An emerging issue from a new technology?
Electrophoresis. 2022 Jul;43(13-14):1521-1530. doi: 10.1002/elps.202100143. Epub 2022 May 11.

Cited by

1
A detailed analysis of second and third-generation sequencing approaches for accurate length determination of short tandem repeats and homopolymers.
Nucleic Acids Res. 2025 Feb 27;53(5). doi: 10.1093/nar/gkaf131.
2
Digital sequencing is improved by using structured unique molecular identifiers.
Genome Biol. 2025 Feb 25;26(1):37. doi: 10.1186/s13059-025-03504-x.
3
Navigating triplet repeats sequencing: concepts, methodological challenges and perspective for Huntington's disease.
Nucleic Acids Res. 2025 Jan 7;53(1). doi: 10.1093/nar/gkae1155.
4
Age-dependent somatic expansion of the ATXN3 CAG repeat in the blood and buccal swab DNA of individuals with spinocerebellar ataxia type 3/Machado-Joseph disease.
Hum Genet. 2024 Nov;143(11):1363-1378. doi: 10.1007/s00439-024-02698-7. Epub 2024 Oct 8.
5
Short Tandem Repeat (STR) Profiling of Earwax DNA Obtained from Healthy Volunteers.
Curr Issues Mol Biol. 2023 Jul 10;45(7):5741-5751. doi: 10.3390/cimb45070362.
6
A critical spotlight on the paradigms of FFPE-DNA sequencing.
Nucleic Acids Res. 2023 Aug 11;51(14):7143-7162. doi: 10.1093/nar/gkad519.
7
Applying Unique Molecular Indices with an Extensive All-in-One Forensic SNP Panel for Improved Genotype Accuracy and Sensitivity.
Genes (Basel). 2023 Mar 29;14(4):818. doi: 10.3390/genes14040818.
8
Precision DNA Mixture Interpretation with Single-Cell Profiling.
Genes (Basel). 2021 Oct 20;12(11):1649. doi: 10.3390/genes12111649.
9
Noninvasive Prenatal Paternity Testing with a Combination of Well-Established SNP and STR Markers Using Massively Parallel Sequencing.
Genes (Basel). 2021 Mar 22;12(3):454. doi: 10.3390/genes12030454.