Klukowski Piotr, Walczak Michal J, Gonczarek Adam, Boudet Julien, Wider Gerhard
Department of Computer Science, Wroclaw University of Technology, Wroclaw, Poland.
Institute of Molecular Biology and Biophysics, ETH Zurich, 8093 Zurich, Switzerland.
Bioinformatics. 2015 Sep 15;31(18):2981-8. doi: 10.1093/bioinformatics/btv318. Epub 2015 May 20.
A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but major deficiencies remain that often prevent complete and error-free peak picking in spectra of biological macromolecules. The major challenges for automated peak picking are twofold: distinguishing artifacts from real peaks, particularly real peaks with irregular shapes, and picking peaks in spectral regions with overlapping resonances, which existing computer algorithms find very hard to resolve. In both cases, visual inspection could be more effective than a 'blind' algorithm.
We present a novel approach using computer vision (CV) methodology, which could be better adapted to the problem of peak recognition. After suitable 'training', we successfully applied the CV algorithm to spectra of medium-sized soluble proteins with molecular weights up to 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets, the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra.
CV-Peak Picker is available upon request from the authors.
gsw@mol.biol.ethz.ch; michal.walczak@mol.biol.ethz.ch; adam.gonczarek@pwr.edu.pl
Supplementary data are available at Bioinformatics online.