GtTR: Bayesian estimation of absolute tandem repeat copy number using sequence capture and high throughput sequencing.

Affiliation

Institute for Molecular Biosciences, University of Queensland, Brisbane, Australia.

Publication information

BMC Bioinformatics. 2018 Jul 16;19(1):267. doi: 10.1186/s12859-018-2282-3.

DOI: 10.1186/s12859-018-2282-3
PMID: 30012093
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6048696/
Abstract

BACKGROUND

Tandem repeats comprise a significant proportion of the human genome, including coding and regulatory regions. Their repetitive and unstable nature makes them highly prone to repeat number variation and nucleotide mutation, and therefore a major source of genomic variation between individuals. Despite recent advances in high-throughput sequencing, analysis of tandem repeats in the context of complex diseases is still hindered by technical limitations. We report a novel targeted sequencing approach that allows simultaneous analysis of hundreds of repeats. We developed a Bayesian algorithm, GtTR, which combines information from a reference long-read dataset with a short-read counting approach to genotype tandem repeats at population scale. PCR sizing analysis was used for validation.
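The read-counting idea described above can be illustrated with a small sketch. It is not the GtTR implementation: the abstract does not specify the likelihood, so a Poisson read-count model with a flat prior is assumed here, and the function name `copy_number_posterior`, its parameters, and the example counts are all hypothetical.

```python
import math


def copy_number_posterior(sample_repeat_reads, sample_flank_reads,
                          ref_repeat_reads, ref_flank_reads,
                          ref_copy_number, max_copy_number=100, grid_points=1000):
    """Grid posterior over the sample copy number (illustrative only).

    Assumptions (not taken from the paper): repeat read counts are Poisson,
    reads scale linearly with copy number, flanking reads correct for depth,
    the reference copy number is known exactly, and the prior is flat.
    """
    # Depth of the sample relative to the reference, from flanking reads.
    depth_ratio = sample_flank_reads / ref_flank_reads
    # Expected repeat reads contributed by one repeat copy in the sample.
    reads_per_copy = (ref_repeat_reads / ref_copy_number) * depth_ratio

    # Candidate copy numbers, evenly spaced in (0, max_copy_number].
    grid = [max_copy_number * i / grid_points for i in range(1, grid_points + 1)]

    # Poisson log-likelihood up to a constant; the flat prior adds nothing.
    log_post = [sample_repeat_reads * math.log(c * reads_per_copy) - c * reads_per_copy
                for c in grid]

    # Normalise in a numerically stable way.
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return grid, [w / total for w in weights]


# Hypothetical counts: the reference has 10 copies and 200 repeat reads,
# the sample has 300 repeat reads at the same flanking depth.
grid, posterior = copy_number_posterior(300, 1000, 200, 1000, ref_copy_number=10)
mode = grid[posterior.index(max(posterior))]
print(f"posterior mode: {mode:.1f} copies")
```

Under these assumptions the posterior peaks where the expected read count matches the observed count, so an error in the reference copy number shifts every estimate proportionally.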

RESULTS

We used a PacBio long-read sequenced sample to generate a reference tandem repeat genotype dataset with, on average, 13% absolute deviation from PCR sizing results. Using this reference dataset, GtTR generated estimates of variable number tandem repeat (VNTR) copy number with accuracy within 95% highest posterior density (HPD) intervals of 68% and 83% for capture sequence data and 200X whole-genome sequencing (WGS) data, respectively, improving to 87% and 94% with use of a PCR reference. We show that genotype resolution increases as a function of depth, such that the median 95% HPD interval lies within 25%, 14%, 12% and 8% of its midpoint copy number value for 30X and 200X WGS data and 395X and 800X capture sequence data, respectively. We validated nine targets by PCR sizing analysis, and genotype estimates from sequencing results correlated well with PCR results.
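To make the interval statistic concrete, the following sketch computes a 95% HPD interval from posterior draws and reports its half-width relative to its midpoint at several depths. It is a toy model, not the paper's method: it assumes Poisson read counts with a flat prior (giving a Gamma posterior), a hypothetical `reads_per_copy_per_x` scaling constant, and a hypothetical true copy number of 10, so the printed widths are not expected to match the figures reported above.

```python
import math
import random


def hpd_from_samples(samples, mass=0.95):
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))
    # Slide a window of fixed sample count and keep the narrowest one.
    start = min(range(n - window + 1),
                key=lambda i: ordered[i + window - 1] - ordered[i])
    return ordered[start], ordered[start + window - 1]


def relative_half_width(low, high):
    """Half-width of an interval expressed as a fraction of its midpoint."""
    return (high - low) / (high + low)


def simulated_width(depth, copy_number=10.0, reads_per_copy_per_x=0.5,
                    draws=20000, seed=1):
    """Relative 95% HPD half-width at a given depth under the toy model.

    Assumption: with a flat prior and Poisson counts, the copy-number
    posterior is Gamma(observed + 1, scale = 1 / reads_per_copy).
    """
    rng = random.Random(seed)
    reads_per_copy = reads_per_copy_per_x * depth
    observed = round(copy_number * reads_per_copy)
    samples = [rng.gammavariate(observed + 1, 1.0 / reads_per_copy)
               for _ in range(draws)]
    return relative_half_width(*hpd_from_samples(samples))


for depth in (30, 200, 395, 800):
    print(f"{depth}X: 95% HPD within {100 * simulated_width(depth):.1f}% of midpoint")
```

In this toy model the relative width shrinks roughly with the square root of the read count, which is one simple way to see why resolution improves with depth.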

CONCLUSIONS

The novel genotyping approach described here presents a new cost-effective method to explore a previously unrecognized class of repeat variation in genome-wide association studies (GWAS) of complex diseases at the population level. Further gains in accuracy can be obtained by improving the accuracy of the reference dataset.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/1d9e07b3eecf/12859_2018_2282_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/4699864f7eca/12859_2018_2282_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/9134d846e964/12859_2018_2282_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/a3b509146046/12859_2018_2282_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/2025f162031c/12859_2018_2282_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/87269d140fe8/12859_2018_2282_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/4d6f8789eb00/12859_2018_2282_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/c2e99a0b0f9e/12859_2018_2282_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/410a/6048696/4e6bc0fcee15/12859_2018_2282_Fig9_HTML.jpg
