MOE Key Laboratory of Bioinformatics and Bioinformatics Division, TNLIST/Department of Automation, Tsinghua University, Beijing 100084, People's Republic of China.
IET Syst Biol. 2014 Apr;8(2):33-40. doi: 10.1049/iet-syb.2013.0033.
Detecting associations between human genetic variants and their phenotypic effects is a significant problem in understanding the genetic bases of human inherited diseases. This study focuses on a typical type of genetic variant, non-synonymous single nucleotide polymorphisms (nsSNPs), whose occurrence may alter protein structure, affect protein function, and thereby cause disease. Most existing methods predict associations between nsSNPs and diseases using features derived only from protein sequence and/or structure information, and they give no information about which specific disease an nsSNP is associated with. To address these problems, the identification of nsSNPs associated with a specific disease from a set of candidate nsSNPs is formulated as a binary classification problem. A new approach is proposed that predicts associations between nsSNPs and diseases based on multiple nsSNP similarity networks and disease phenotype similarity networks. A series of comprehensive validation experiments demonstrates that the proposed method is effective both in recovering nsSNP-disease associations and in inferring suspected disease-associated nsSNPs, for diseases with known genetic bases as well as diseases of unknown genetic bases.