A model of higher accuracy for the individual haplotyping problem based on weighted SNP fragments and genotype with errors.

Authors

Xie Minzhu, Wang Jianxin, Chen Jianer

Affiliation

School of Information Science and Engineering, Central South University, Changsha 410083, China.

Publication

Bioinformatics. 2008 Jul 1;24(13):i105-13. doi: 10.1093/bioinformatics/btn147.

Abstract

MOTIVATION

In genetic studies of complex diseases, haplotypes provide more information than genotypes. However, determining haplotypes with biological techniques is much more difficult than genotyping, so effective computational techniques are in demand. The individual haplotyping problem is the computational problem of inferring a pair of haplotypes from an individual's aligned SNP fragments. Many models for the problem have been proposed, based on various optimality criteria and incorporating different kinds of additional information. Improving the accuracy of these models remains an important issue in the study of haplotype reconstruction.
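To make the problem statement concrete, the following is a minimal brute-force sketch of one common weighted criterion: choose the haplotype pair that minimizes the total weight of fragment entries that must be flipped. It is not the model or the parameterized algorithm of this article. It assumes, for illustration only, that every site is heterozygous (so the second haplotype is the complement of the first) and that alleles are coded 0/1; the function names and the example data are hypothetical.

```python
from itertools import product


def fragment_cost(fragment, weights, haplotype):
    """Total weight of covered sites where the fragment disagrees with the haplotype.

    Uncovered sites are represented by None and contribute nothing.
    """
    return sum(
        w
        for allele, w, h in zip(fragment, weights, haplotype)
        if allele is not None and allele != h
    )


def brute_force_haplotypes(fragments, weights):
    """Enumerate complementary haplotype pairs and return (cost, h1, h2).

    Assumption (not from the article): all sites are heterozygous, so
    h2 is the bitwise complement of h1. Runs in time exponential in the
    number of sites, so it is only usable on toy inputs.
    """
    n = len(fragments[0])
    best = None
    for h1 in product((0, 1), repeat=n):
        if h1[0] == 1:
            continue  # (h2, h1) is the same pair as (h1, h2); skip the mirror image
        h2 = tuple(1 - a for a in h1)
        # Each fragment is assigned to whichever haplotype it fits better.
        cost = sum(
            min(fragment_cost(f, w, h1), fragment_cost(f, w, h2))
            for f, w in zip(fragments, weights)
        )
        if best is None or cost < best[0]:
            best = (cost, h1, h2)
    return best


# Hypothetical data: four SNP sites, five fragments, None = site not covered.
fragments = [
    (0, 0, 1, None),
    (None, 0, 1, 0),
    (1, 1, 0, None),
    (None, 1, 0, 1),
    (0, 1, 1, None),  # the middle call is a low-confidence read
]
weights = [
    (1.0, 1.0, 1.0, 0.0),
    (0.0, 1.0, 1.0, 1.0),
    (1.0, 1.0, 1.0, 0.0),
    (0.0, 1.0, 1.0, 1.0),
    (0.9, 0.2, 0.9, 0.0),
]
cost, h1, h2 = brute_force_haplotypes(fragments, weights)
```

In this toy input the low-weight entry of the last fragment is the cheapest thing to flip, so the weights steer the reconstruction toward treating it as the error.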

RESULTS

The current article proposes a highly accurate model for the single individual haplotyping problem based on weighted fragments and genotypes with errors. The model is proved to be NP-hard even with gapless fragments. Based on the characteristics of Single Nucleotide Polymorphism (SNP) fragments, a parameterized algorithm of time complexity $O(nk_2 2^{k_2} + m \log m + mk_1)$ is developed, where $m$ is the number of fragments, $n$ is the number of SNP sites, $k_1$ is the maximum number of SNP sites that a fragment covers (no more than $n$ and usually smaller than 10) and $k_2$ is the maximum number of fragments covering a SNP site (usually no more than 19). Extensive experiments show that this model is more accurate in haplotype reconstruction than other models.
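As a rough illustration of how the three terms of the stated bound compare, the sketch below evaluates each one for a single set of parameters. The values of n and m are hypothetical; k1 = 10 and k2 = 19 are taken from the typical limits quoted above. Constant factors are ignored and the logarithm base is assumed to be 2, so the numbers indicate relative magnitude only, not running time.

```python
import math


def bound_terms(n, m, k1, k2):
    """Return the three terms of n*k2*2^k2 + m*log(m) + m*k1 separately.

    Assumptions: log base 2 and no constant factors, since the bound is
    asymptotic and specifies neither.
    """
    exponential_term = n * k2 * 2 ** k2
    mlogm_term = m * math.log2(m)
    mk1_term = m * k1
    return exponential_term, mlogm_term, mk1_term


# n and m are made-up example sizes; k1 and k2 follow the typical limits above.
exp_term, mlogm_term, mk1_term = bound_terms(n=100, m=1000, k1=10, k2=19)
print(f"n*k2*2^k2 = {exp_term:,}")
print(f"m*log2(m) = {mlogm_term:,.0f}")
print(f"m*k1      = {mk1_term:,}")
```

For these values the term exponential in k2 dominates by several orders of magnitude, which is why the bound depends on the per-site coverage k2 staying small; the dependence on n and m is only linear and near-linear.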

AVAILABILITY

The program of the parameterized algorithm can be obtained by sending an email to the corresponding author.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1d2b/2718625/a0e204086c07/btn147f1.jpg
