Linkage disequilibrium based genotype calling from low-coverage shotgun sequencing reads.

Affiliation

Department of Computer Science & Engineering, University of Connecticut, 371 Fairfield Rd, Unit 2155, Storrs, CT 06269-2155, USA.

Publication information

BMC Bioinformatics. 2011 Feb 15;12 Suppl 1(Suppl 1):S53. doi: 10.1186/1471-2105-12-S1-S53.

DOI: 10.1186/1471-2105-12-S1-S53
PMID: 21342586
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3044311/

Abstract

BACKGROUND

Recent technology advances have enabled sequencing of individual genomes, promising to revolutionize biomedical research. However, deep sequencing remains more expensive than microarrays for performing whole-genome SNP genotyping.

RESULTS

In this paper we introduce a new multi-locus statistical model and computationally efficient genotype calling algorithms that integrate shotgun sequencing data with linkage disequilibrium (LD) information extracted from reference population panels such as HapMap or the 1000 Genomes Project. Experiments on publicly available 454, Illumina, and ABI SOLiD sequencing datasets suggest that integrating LD information yields genotype calling accuracy from low-coverage sequencing data comparable to that of microarray platforms. A software package implementing our algorithm, released under the GNU General Public License, is available at http://dna.engr.uconn.edu/software/GeneSeq/.

CONCLUSIONS

Integration of LD information leads to significant improvements in genotype calling accuracy compared to prior LD-oblivious methods, making low-coverage sequencing a viable alternative to microarrays for conducting large-scale genome-wide association studies.
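
The abstract does not spell out the model, so the following is only a minimal illustrative sketch of the general idea, not the authors' multi-locus algorithm or the GeneSeq implementation. It assumes a single biallelic SNP, a simple binomial read-error model, and a genotype prior supplied by hand. In an LD-based caller such a prior would be derived from neighboring SNPs and a reference panel. All function names, the error rate, and the prior values are hypothetical.

```python
from math import comb


def genotype_likelihoods(ref_reads, alt_reads, error_rate=0.01):
    """Return P(reads | genotype) for 0, 1, and 2 copies of the alternate allele.

    Assumes reads are independent and each read is miscalled with
    probability error_rate (an illustrative value, not from the paper).
    """
    total_reads = ref_reads + alt_reads
    likelihoods = []
    for alt_copies in (0, 1, 2):
        alt_fraction = alt_copies / 2
        # Probability that one read shows the alternate allele.
        p_alt = alt_fraction * (1 - error_rate) + (1 - alt_fraction) * error_rate
        likelihoods.append(
            comb(total_reads, alt_reads)
            * p_alt ** alt_reads
            * (1 - p_alt) ** ref_reads
        )
    return likelihoods


def genotype_posterior(likelihoods, prior):
    """Combine read likelihoods with a genotype prior using Bayes' rule."""
    joint = [lik * p for lik, p in zip(likelihoods, prior)]
    total = sum(joint)
    return [j / total for j in joint]


def call_genotype(posterior):
    """Return the number of alternate-allele copies with the highest posterior."""
    return max(range(3), key=lambda g: posterior[g])


# Low coverage: only two reads, both showing the alternate allele.
likelihoods = genotype_likelihoods(ref_reads=0, alt_reads=2)

# An LD-oblivious caller, modeled here as a flat prior over genotypes.
flat_prior = [1 / 3, 1 / 3, 1 / 3]
flat_call = call_genotype(genotype_posterior(likelihoods, flat_prior))

# A hypothetical LD-informed prior that strongly favors the heterozygote.
ld_prior = [0.05, 0.90, 0.05]
ld_call = call_genotype(genotype_posterior(likelihoods, ld_prior))

print(flat_call, ld_call)  # prints "2 1"
```

With only two reads, the flat prior calls the site homozygous for the alternate allele, while the informative prior shifts the call to heterozygous. This is the kind of correction that prior information can supply when read evidence alone is thin.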
