Institute of Medical Genetics, School of Medicine, Cardiff University, Cardiff, United Kingdom.
Hum Mutat. 2010 Mar;31(3):335-46. doi: 10.1002/humu.21192.
An important challenge in translational bioinformatics is to understand how genetic variation gives rise to molecular changes at the protein level that can precipitate both monogenic and complex disease. To this end, we compiled datasets of human disease-associated amino acid substitutions (AAS) in the contexts of inherited monogenic disease, complex disease, functional polymorphisms with no known disease association, and somatic mutations in cancer, and compared them with respect to predicted functional sites in proteins. When the sequence homology-based tool SIFT was used to estimate the proportion of deleterious AAS in each dataset, only complex disease AAS were found to be indistinguishable from neutral polymorphic AAS. Monogenic disease AAS predicted to be nondeleterious by SIFT were characterized by a significant enrichment for inherited AAS within solvent-accessible residues and regions of intrinsic protein disorder, and by an association with the loss or gain of various posttranslational modifications. Sites of structural and/or functional interest were therefore surmised to constitute useful additional features with which to identify the molecular disruptions caused by deleterious AAS. A range of bioinformatic tools, designed to predict structural and functional sites in protein sequences, were then employed to demonstrate that intrinsic biases exist in the distribution of different types of human AAS with respect to specific structural, functional and pathological features. Our Web tool, designed to potentiate the functional profiling of novel AAS, has been made available at http://profile.mutdb.org/.
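The dataset comparison described above (estimating the proportion of predicted-deleterious AAS per dataset and asking whether two datasets differ) can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes per-substitution SIFT scores are already available as plain floats, uses the commonly cited SIFT cutoff of 0.05 (whether this study applied exactly that cutoff is an assumption), and substitutes a two-sided Fisher exact test, which the abstract does not name. The function names and the toy scores are hypothetical.

```python
from math import comb

# Commonly cited SIFT convention: scores below 0.05 are called deleterious.
# Assumption: the abstract does not state the cutoff that was used.
SIFT_CUTOFF = 0.05


def count_deleterious(scores, cutoff=SIFT_CUTOFF):
    """Number of substitutions whose SIFT score falls below the cutoff."""
    return sum(score < cutoff for score in scores)


def deleterious_fraction(scores, cutoff=SIFT_CUTOFF):
    """Fraction of substitutions predicted deleterious in one dataset."""
    if not scores:
        raise ValueError("scores must be non-empty")
    return count_deleterious(scores, cutoff) / len(scores)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are datasets; columns are (deleterious, nondeleterious) counts.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x deleterious calls in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )


def compare_datasets(scores_a, scores_b, cutoff=SIFT_CUTOFF):
    """Return (fraction_a, fraction_b, p_value) for two lists of SIFT scores."""
    del_a = count_deleterious(scores_a, cutoff)
    del_b = count_deleterious(scores_b, cutoff)
    p_value = fisher_exact_two_sided(
        del_a, len(scores_a) - del_a, del_b, len(scores_b) - del_b
    )
    return del_a / len(scores_a), del_b / len(scores_b), p_value


if __name__ == "__main__":
    # Toy scores invented purely for illustration; they are not study data.
    complex_disease = [0.30, 0.02, 0.45, 0.60, 0.12, 0.80]
    neutral = [0.25, 0.04, 0.50, 0.70, 0.15, 0.90]
    print(compare_datasets(complex_disease, neutral))
```

A large p-value from such a test would correspond to two datasets being "indistinguishable" in the sense used above, although the statistical procedure the study actually applied may differ.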