ScatTR: Estimating the Size of Long Tandem Repeat Expansions from Short-Reads.

Authors

Al-Abri Rashid, Gürsoy Gamze

Affiliations

Department of Computer Science, Columbia University, New York, USA.

New York Genome Center, New York, USA.

Publication

bioRxiv. 2025 Feb 20:2025.02.15.638440. doi: 10.1101/2025.02.15.638440.

DOI: 10.1101/2025.02.15.638440
PMID: 40027646
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11870476/
Abstract

Tandem repeats (TRs) are sequences of DNA where two or more base pairs are repeated back-to-back at specific locations in the genome. The expansions of TRs are implicated in over 50 conditions, including Friedreich's ataxia, autism, and cancer. However, accurately measuring the copy number of TRs is challenging, especially when their expansions are larger than the fragment sizes used in standard short-read genome sequencing. Here we introduce ScatTR, a novel computational method that leverages a maximum likelihood framework to estimate the copy number of large TR expansions from short-read sequencing data. ScatTR calculates the likelihood of different alignments between sequencing reads and reference sequences that represent various TR lengths and employs a Monte Carlo technique to find the best match. In simulated data, ScatTR outperforms state-of-the-art methods, particularly for TRs with longer motifs and those with lengths that greatly exceed typical sequencing fragment sizes. When applied to data from the 1000 Genomes Project, ScatTR detected potential large TR expansions that other methods missed, highlighting its ability to improve genome-wide characterization of TR variation. ScatTR can be accessed via: https://github.com/g2lab/scattr.
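
The general idea of pairing a likelihood with a stochastic search can be illustrated with a deliberately simplified toy model. The sketch below is not ScatTR's algorithm and does not use its code or API. It assumes, purely for illustration, that the number of reads originating inside the repeat follows a Poisson distribution whose mean grows linearly with the copy number, and it searches over copy numbers with a simulated-annealing-style random walk. All function names, parameters, and default values here are hypothetical.

```python
import math
import random


def log_likelihood(copy_number, n_repeat_reads, motif_len, read_len, depth):
    """Poisson log-likelihood of the observed in-repeat read count (toy model)."""
    # Assumed expected read count: depth * repeat length in bp / read length.
    lam = depth * copy_number * motif_len / read_len
    return n_repeat_reads * math.log(lam) - lam - math.lgamma(n_repeat_reads + 1)


def estimate_copy_number(n_repeat_reads, motif_len, read_len, depth,
                         max_copies=5000, n_iter=20000, seed=0):
    """Stochastic search for the copy number that maximizes the toy likelihood."""
    rng = random.Random(seed)
    current = rng.randint(1, max_copies)
    current_ll = log_likelihood(current, n_repeat_reads, motif_len, read_len, depth)
    best, best_ll = current, current_ll
    for i in range(n_iter):
        # Cool the temperature linearly so late moves are mostly uphill.
        temperature = max(1e-3, 1.0 - i / n_iter)
        # Propose a nearby copy number, clamped to the allowed range.
        proposal = min(max_copies, max(1, current + rng.randint(-50, 50)))
        proposal_ll = log_likelihood(proposal, n_repeat_reads, motif_len,
                                     read_len, depth)
        delta = proposal_ll - current_ll
        # Always accept improvements; accept worse moves with a small probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_ll = proposal, proposal_ll
            if current_ll > best_ll:
                best, best_ll = current, current_ll
    return best


if __name__ == "__main__":
    # Hypothetical inputs: 1000 in-repeat reads, 5 bp motif, 150 bp reads, 30x depth.
    print(estimate_copy_number(1000, motif_len=5, read_len=150, depth=30.0))
```

Under these assumed inputs the expected read count equals the copy number, so the estimate should land near 1000. A real method would have to account for read alignment, fragment-size distributions, and sequencing error, none of which this toy model represents.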

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae1a/11870476/07b8b018afc9/nihpp-2025.02.15.638440v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae1a/11870476/14f50a848812/nihpp-2025.02.15.638440v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae1a/11870476/27b06397ec04/nihpp-2025.02.15.638440v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae1a/11870476/fbb4d572bfaf/nihpp-2025.02.15.638440v1-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ae1a/11870476/4e094463b00e/nihpp-2025.02.15.638440v1-f0005.jpg
