Computational identification of residues that modulate voltage sensitivity of voltage-gated potassium channels.
Author information
Li Bin, Gallin Warren J
Affiliation
Department of Biological Sciences, University of Alberta, Edmonton, T6G 2E9, Canada.
Publication information
BMC Struct Biol. 2005 Aug 19;5:16. doi: 10.1186/1472-6807-5-16.
BACKGROUND
Studies of the structure-function relationship in proteins for which no 3D structure is available are often based on inspection of multiple sequence alignments. Many functionally important residues can be identified because they are conserved during evolution. However, residues that vary can also be critically important if their variation is responsible for diversity of protein function and improved phenotypes. If too few sequences are studied, support for hypotheses about the role of a given residue will be weak, but large multiple alignments are too complex to analyze by simple inspection. When a large body of sequence and functional data is available for a protein family, mature data mining tools, such as machine learning, can be applied to extract information more easily, sensitively and reliably. We have undertaken such an analysis of voltage-gated potassium channels, a transmembrane protein family whose members play indispensable roles in electrically excitable cells.
RESULTS
We applied different learning algorithms, combined in various implementations, to obtain a model that predicts the half-activation voltage of a voltage-gated potassium channel from its amino acid sequence. The best result was obtained with a k-nearest neighbor classifier combined with a wrapper algorithm for feature selection, producing a mean absolute prediction error of 7.0 mV. The predictor was validated by a permutation test and by evaluation against independent experimental data. Feature selection identified a number of residues that are predicted to be involved in the voltage-sensitive conformational changes; these residues are good candidate targets for mutagenesis analysis.
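The general shape of such a pipeline can be illustrated with a minimal sketch. It is not the authors' implementation. The toy sequences and half-activation voltages are invented for illustration. The Hamming distance over alignment columns, the greedy forward search used as the wrapper, and all function names are assumptions. Leave-one-out error serves as the wrapper's score, and the permutation test reruns the whole selection on shuffled voltages.

```python
import random

# Hypothetical toy data: (aligned sequence, half-activation voltage in mV).
# Column 1 (R/K/Q) is constructed to be the only informative position.
channels = [
    ("ARLGS", -40), ("ARIGT", -42), ("SRLAT", -38),
    ("AKLGT", -10), ("SKIGS", -12), ("AKIAS", -8),
    ("SQLGS", 20), ("AQIAT", 22), ("SQIGT", 18),
]

def hamming(a, b, cols):
    """Count mismatches between two aligned sequences over the chosen columns."""
    return sum(a[c] != b[c] for c in cols)

def knn_predict(query, train, cols, k=1):
    """Average the voltages of the k training channels closest to the query."""
    ranked = sorted(train, key=lambda item: hamming(query, item[0], cols))
    nearest = ranked[:k]
    return sum(v for _, v in nearest) / len(nearest)

def loo_mae(data, cols, k=1):
    """Leave-one-out mean absolute error (mV) using only the chosen columns."""
    errors = []
    for i, (seq, v50) in enumerate(data):
        train = data[:i] + data[i + 1:]
        errors.append(abs(knn_predict(seq, train, cols, k) - v50))
    return sum(errors) / len(errors)

def forward_select(data, n_cols, k=1):
    """Greedy wrapper: add the column that lowers the error most, stop when none does."""
    selected, best = [], float("inf")
    while len(selected) < n_cols:
        candidates = [c for c in range(n_cols) if c not in selected]
        score, col = min((loo_mae(data, selected + [c], k), c) for c in candidates)
        if score >= best:
            break
        selected.append(col)
        best = score
    return selected, best

def permutation_p(data, n_cols, observed, k=1, n_perm=200, seed=0):
    """Fraction of label shuffles whose selected model does at least as well."""
    rng = random.Random(seed)
    seqs = [s for s, _ in data]
    values = [v for _, v in data]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(values)
        _, mae = forward_select(list(zip(seqs, values)), n_cols, k)
        if mae <= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

selected, mae = forward_select(channels, n_cols=5)
p_value = permutation_p(channels, 5, mae)
print("selected columns:", selected, "LOO MAE (mV):", mae, "p:", p_value)
```

On this toy set the search keeps only column 1, because no additional column lowers the leave-one-out error. With real data the selected columns would be the candidate residues for mutagenesis, and the error should also be checked on measurements that were never used during selection.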
CONCLUSION
Machine learning analysis can identify new testable hypotheses about the structure/function relationship in the voltage-gated potassium channel family. This approach should be applicable to any protein family, provided that the number of training examples and the sequence diversity of the training set needed for robust prediction are established empirically. The predictor and datasets can be found at the VKCDB web site.