Identifying mutation regions for closely related individuals without a known pedigree.

Affiliation

Department of Computer Science, City University of Hong Kong, Kowloon, Hong Kong.

Publication information

BMC Bioinformatics. 2012 Jun 25;13:146. doi: 10.1186/1471-2105-13-146.

DOI:10.1186/1471-2105-13-146
PMID:22731852
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3507658/
Abstract

BACKGROUND

Linkage analysis is the first step in the search for a disease gene. Linkage studies have facilitated the identification of several hundred human genes that can harbor mutations leading to a disease phenotype. In this paper, we study a very important case in which the sampled individuals are closely related but the pedigree is not given. This situation arises very often when the individuals share a common ancestor from 6 or more generations ago. To our knowledge, no existing algorithm gives good results in this case.

RESULTS

To solve this problem, we first developed some heuristic algorithms for haplotype inference without any given pedigree. We propose a model using the parsimony principle that can be viewed as an extension of the model first proposed by Dan Gusfield. Our heuristic algorithm uses Clark's inference rule to infer haplotype segments.
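
For readers unfamiliar with Clark's inference rule, the following is a minimal sketch of the textbook form of the rule, not the authors' segment-based heuristic. It assumes a hypothetical encoding in which each genotype site is 0 or 1 when homozygous and 2 when heterozygous; the function names `complement` and `clark_phase` are illustrative only. The rule says: if a known haplotype is compatible with an unresolved genotype, infer the complementary haplotype and add it to the known set.

```python
from typing import Dict, List, Optional, Set, Tuple

HET = 2  # assumed genotype code for a heterozygous site
Hap = Tuple[int, ...]


def complement(genotype: Tuple[int, ...], hap: Hap) -> Optional[Hap]:
    """Return the haplotype that, paired with `hap`, explains `genotype`.

    Returns None when `hap` is incompatible with the genotype.
    """
    other = []
    for g, h in zip(genotype, hap):
        if g == HET:
            other.append(1 - h)   # heterozygous site: partner carries the other allele
        elif g == h:
            other.append(h)       # homozygous site: both haplotypes agree
        else:
            return None           # homozygous site contradicts `hap`
    return tuple(other)


def clark_phase(genotypes: List[Tuple[int, ...]]) -> Tuple[Dict[int, Tuple[Hap, Hap]], Set[Hap]]:
    """Greedy Clark-style phasing; genotypes that cannot be resolved are left out."""
    known: Set[Hap] = set()
    resolved: Dict[int, Tuple[Hap, Hap]] = {}

    # Seed the known set with genotypes that have at most one heterozygous site,
    # because their haplotype pair is unambiguous.
    for i, g in enumerate(genotypes):
        if sum(1 for x in g if x == HET) <= 1:
            h1 = tuple(0 if x == HET else x for x in g)
            h2 = complement(g, h1)
            known.update((h1, h2))
            resolved[i] = (h1, h2)

    # Repeatedly apply the rule until no further genotype can be resolved.
    progress = True
    while progress:
        progress = False
        for i, g in enumerate(genotypes):
            if i in resolved:
                continue
            for h in sorted(known):      # sorted copy gives a deterministic order
                h2 = complement(g, h)
                if h2 is not None:
                    known.add(h2)
                    resolved[i] = (h, h2)
                    progress = True
                    break
    return resolved, known


resolved, known = clark_phase([(0, 1, 0), (2, 1, 2), (1, 2, 2)])
```

In this toy input the first genotype is fully homozygous and seeds the known set; the second is then resolved against it, and the haplotype inferred there resolves the third. The result of the greedy procedure can depend on the order in which known haplotypes are tried, which is one reason parsimony-based formulations are used alongside it.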

CONCLUSIONS

We ran our program on both simulated data and a set of real data from the phase II HapMap database. Experiments show that the program performs well. Recall ranges from 90% to 99% across the tested cases, so the program reports at least 90% of the true mutation regions. Precision varies from 29% to 90%. Even when the precision is 29%, the reported regions are only about three times the size of the true mutation region, which is still very useful for narrowing down the location of the disease gene. The program completes the computation within 20 seconds for every tested case, each with about 110,000 SNPs on a chromosome.
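
The link between a precision of 29% and reported regions about three times the true size can be checked with a short sketch. The abstract does not define the two measures, so the code assumes a length-based definition: precision is the overlap divided by the total reported length, and recall is the overlap divided by the total true length. Regions are modeled as hypothetical half-open intervals of SNP indices, and the numbers are made up for illustration. Under this assumption the ratio of reported size to true size equals recall divided by precision.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # assumed: half-open interval [start, end) of SNP indices


def total_length(regions: List[Region]) -> int:
    """Sum of the lengths of a list of non-overlapping regions."""
    return sum(end - start for start, end in regions)


def overlap_length(a: List[Region], b: List[Region]) -> int:
    """Total length shared between two lists of non-overlapping regions."""
    return sum(
        max(0, min(e1, e2) - max(s1, s2))
        for s1, e1 in a
        for s2, e2 in b
    )


def precision_recall(reported: List[Region], truth: List[Region]) -> Tuple[float, float]:
    """Length-based precision and recall (an assumed definition)."""
    shared = overlap_length(reported, truth)
    return shared / total_length(reported), shared / total_length(truth)


# Illustrative numbers only: a true region of 100 SNPs, and a reported
# region of 310 SNPs that covers 90 of them.
truth = [(300, 400)]
reported = [(80, 390)]

precision, recall = precision_recall(reported, truth)
size_ratio = total_length(reported) / total_length(truth)

print(f"precision={precision:.2f} recall={recall:.2f} size ratio={size_ratio:.2f}")
```

With recall between 0.90 and 0.99 and precision at 0.29, this ratio falls between roughly 3.1 and 3.4, consistent with the "three times" figure in the abstract.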
