Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, 200240, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai, 200240, China.
Proteins. 2015 Mar;83(3):485-96. doi: 10.1002/prot.24744. Epub 2015 Jan 24.
Residue contact maps are essential for determining protein three-dimensional structure. However, most current contact prediction methods based on residue co-evolution suffer from high false-positive rates caused by indirect and transitive contacts (i.e., residues A-B and B-C are in contact, but A-C are not). Building on the work of Feizi et al. (Nat Biotechnol 2013; 31:726-733), which introduced a general network model for distinguishing direct dependencies by network deconvolution, this study presents a new balanced network deconvolution (BND) algorithm that identifies an optimized dependency matrix without restricting the eigenvalue range of the network systems to which it is applied. The algorithm was used to filter the contact predictions of five widely used co-evolution methods. On proteins from three benchmark datasets, namely the 9th Critical Assessment of protein Structure Prediction (CASP9), CASP10, and the PSICOV (precise structural contact prediction using sparse inverse covariance estimation) database experiments, BND improves medium- and long-range contact predictions at the L/5 cutoff by 55.59% and 47.68%, respectively, without additional CPU cost. The improvement is statistically significant, with a P-value < 5.93 × 10⁻³ in Student's t-test. A further comparison with the ab initio structure predictions in the CASP experiments showed that the usefulness of current co-evolution-based contact prediction for three-dimensional structure modeling depends on the number of homologous sequences available in the sequence databases. BND can be used as a general contact refinement method and is freely available at: http://www.csbio.sjtu.edu.cn/bioinf/BND/.
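To make the transitive-contact problem concrete, the following is a minimal sketch of the closed-form network deconvolution that BND builds on, not of BND itself, whose balancing step the abstract does not specify. It assumes the observed score matrix is the sum of all path powers of a direct matrix, G_obs = G_dir + G_dir² + ..., so that G_dir = G_obs (I + G_obs)⁻¹. The function name `network_deconvolution`, the toy weights, and the assumption that the input is symmetric and already scaled so every eigenvalue exceeds -1 are illustrative choices, not taken from the paper.

```python
import numpy as np


def network_deconvolution(g_obs: np.ndarray) -> np.ndarray:
    """Recover direct dependencies via G_dir = G_obs (I + G_obs)^-1.

    Assumes (hypothetically) a symmetric score matrix whose eigenvalues
    are all greater than -1, i.e. the series model above converges.
    """
    g = (g_obs + g_obs.T) / 2.0              # enforce symmetry
    eigvals, eigvecs = np.linalg.eigh(g)     # g = V diag(eigvals) V^T
    if np.any(eigvals <= -1.0):
        raise ValueError("eigenvalues must be > -1; rescale the input first")
    dir_vals = eigvals / (1.0 + eigvals)     # map each observed eigenvalue
    return (eigvecs * dir_vals) @ eigvecs.T  # V diag(dir_vals) V^T


# Toy chain A-B-C: A-B and B-C are direct contacts, A-C is not.
g_dir = np.array([
    [0.0, 0.4, 0.0],
    [0.4, 0.0, 0.4],
    [0.0, 0.4, 0.0],
])

# Observed scores include every indirect path: G_dir (I - G_dir)^-1.
g_obs = g_dir @ np.linalg.inv(np.eye(3) - g_dir)

g_rec = network_deconvolution(g_obs)

print("observed A-C score: ", round(float(g_obs[0, 2]), 4))
print("recovered A-C score:", round(float(abs(g_rec[0, 2])), 4))
```

In this toy case the observed A-C score is nonzero purely because of the A-B-C path, and deconvolution returns it to approximately zero. The eigenvalue check in the sketch is the kind of range restriction that, according to the abstract, BND is designed to avoid.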