
Selecting additional tag SNPs for tolerating missing data in genotyping.

Authors

Huang Yao-Ting, Zhang Kui, Chen Ting, Chao Kun-Mao

Affiliation

Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan.

Publication

BMC Bioinformatics. 2005 Nov 1;6:263. doi: 10.1186/1471-2105-6-263.

DOI: 10.1186/1471-2105-6-263
PMID: 16259642
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1316880/
Abstract

BACKGROUND

Recent studies have shown that the patterns of linkage disequilibrium observed in human populations have a block-like structure, and a small subset of SNPs (called tag SNPs) is sufficient to distinguish each pair of haplotype patterns in the block. In reality, some tag SNPs may be missing, and we may fail to distinguish two distinct haplotypes due to the ambiguity caused by missing data.
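
As a minimal illustration of the ambiguity described above, the sketch below checks whether a chosen set of SNPs separates every pair of haplotypes in a block. The block `HAPLOTYPES`, the 0/1 allele coding, and the helper name `distinguishes` are all hypothetical and are not taken from the paper.

```python
from itertools import combinations

# Hypothetical block: 4 haplotype patterns over 5 biallelic SNPs (alleles coded 0/1).
HAPLOTYPES = ["00000", "00111", "11001", "11110"]


def distinguishes(haplotypes, snps):
    """Return True if every pair of haplotypes differs at one or more of the given SNP indices."""
    return all(
        any(a[i] != b[i] for i in snps)
        for a, b in combinations(haplotypes, 2)
    )


# SNPs 0 and 2 together separate all four patterns, so they can act as tag SNPs.
tags_ok = distinguishes(HAPLOTYPES, [0, 2])

# If the genotype at SNP 2 is missing, only SNP 0 remains, and the first two
# patterns can no longer be told apart.
tags_after_loss = distinguishes(HAPLOTYPES, [0])
```

In this toy block, losing a single tag SNP is enough to make two distinct haplotypes indistinguishable, which is the situation that motivates selecting additional tag SNPs.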

RESULTS

We show there exists a subset of SNPs (referred to as robust tag SNPs) which can still distinguish all distinct haplotypes even when some SNPs are missing. The problem of finding minimum robust tag SNPs is shown to be NP-hard. To find robust tag SNPs efficiently, we propose two greedy algorithms and one linear programming relaxation algorithm. The experimental results indicate that (1) the solutions found by these algorithms are quite close to the optimal solution; (2) the genotyping cost saved by using tag SNPs can be as high as 80%; and (3) genotyping additional tag SNPs for tolerating missing data is still cost-effective.
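
The abstract does not spell out the greedy algorithms, so the following is only a generic greedy heuristic under a stated assumption, not the authors' method. The assumption is that tolerating up to `m` missing SNPs can be modeled by requiring every pair of haplotypes to differ at `m + 1` or more of the selected SNPs. The function name, the example block, and the tie-breaking rule are hypothetical.

```python
from itertools import combinations


def greedy_robust_tag_snps(haplotypes, m):
    """Greedily pick SNP indices until every haplotype pair differs at m + 1 picked SNPs.

    Assumption (not from the abstract): m + 1 distinguishing SNPs per pair
    is the robustness criterion. Raises ValueError if it cannot be met.
    """
    n_snps = len(haplotypes[0])
    pairs = list(combinations(range(len(haplotypes)), 2))
    # need[pair] = number of additional distinguishing SNPs the pair still requires.
    need = {pair: m + 1 for pair in pairs}
    remaining = list(range(n_snps))
    chosen = []

    def gain(snp):
        # Count the still-deficient pairs that this SNP would help separate.
        return sum(
            1
            for (a, b) in pairs
            if need[(a, b)] > 0 and haplotypes[a][snp] != haplotypes[b][snp]
        )

    while any(count > 0 for count in need.values()):
        # Ties are broken by lowest SNP index (max returns the first maximum).
        best = max(remaining, key=gain, default=None)
        if best is None or gain(best) == 0:
            raise ValueError("no SNP subset meets the m + 1 coverage requirement")
        chosen.append(best)
        remaining.remove(best)
        for (a, b) in pairs:
            if need[(a, b)] > 0 and haplotypes[a][best] != haplotypes[b][best]:
                need[(a, b)] -= 1
    return sorted(chosen)


# Hypothetical block: 4 haplotype patterns over 5 biallelic SNPs (alleles coded 0/1).
block = ["00000", "00111", "11001", "11110"]
plain_tags = greedy_robust_tag_snps(block, m=0)   # ordinary tag SNPs
robust_tags = greedy_robust_tag_snps(block, m=1)  # tolerate one missing SNP
```

A greedy choice like this is not guaranteed to return a minimum set, which is consistent with the NP-hardness result above; how close it gets to the optimum depends on the data.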

CONCLUSION

Genotyping robust tag SNPs is more practical than genotyping only the minimum tag SNPs if we cannot avoid the occurrence of missing data. Our theoretical analysis and experimental results show that our algorithms are efficient and that the solutions they find are close to the optimal solution.




