IEEE/ACM Trans Comput Biol Bioinform. 2017 Sep-Oct;14(5):1195-1201. doi: 10.1109/TCBB.2016.2564964. Epub 2016 May 9.
In peptide mass fingerprinting, an unknown protein is fragmented into smaller peptides whose masses are accurately measured; the resulting list of masses is then compared with a reference database to obtain a set of matching proteins. The exponential growth in the number of known proteins discourages the use of brute-force methods, in which the list of masses is compared with every protein in the reference collection. Fortunately, the database literature shows that well-designed search algorithms, coupled with a suitable data organization, make it possible to solve the identification problem quickly even on standard desktop computers. In this paper, IsAProteinsDB, an indexed database of trypsinized proteins, is presented. The time complexity of the corresponding search algorithm does not depend significantly on the size of the reference protein database.
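To make the contrast between a brute-force scan and an indexed lookup concrete, the following is a minimal sketch of the general idea, not the IsAProteinsDB data structure or algorithm, which the abstract does not describe. It assumes an idealized tryptic digest (cleavage after K or R unless the next residue is P, no missed cleavages), monoisotopic residue masses, and an absolute mass tolerance. All function names (`tryptic_digest`, `peptide_mass`, `build_index`, `search`) and the toy sequences are hypothetical.

```python
import bisect
from collections import Counter

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Split after K or R unless followed by P (assumption: no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        last = i == len(sequence) - 1
        if last or (aa in "KR" and sequence[i + 1] != "P"):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def build_index(proteins):
    """Return parallel lists (masses, names), sorted by peptide mass."""
    entries = sorted(
        (peptide_mass(pep), name)
        for name, seq in proteins.items()
        for pep in tryptic_digest(seq)
    )
    return [m for m, _ in entries], [n for _, n in entries]


def search(index, observed_masses, tol=0.01):
    """Rank proteins by how many observed masses they match within +/- tol."""
    masses, names = index
    hits = Counter()
    for m in observed_masses:
        lo = bisect.bisect_left(masses, m - tol)
        hi = bisect.bisect_right(masses, m + tol)
        # Count each protein at most once per observed mass.
        hits.update(set(names[lo:hi]))
    return hits.most_common()


# Toy reference collection (hypothetical sequences).
proteins = {"P1": "AGKPLRMK", "P2": "GGGKAAAR"}
index = build_index(proteins)
observed = [peptide_mass(p) for p in tryptic_digest(proteins["P1"])]
ranking = search(index, observed)
```

In this sketch each observed mass costs one binary search plus the number of entries inside the tolerance window, so the lookup grows logarithmically with the number of indexed peptides instead of requiring a pass over every protein. A realistic index would also have to handle missed cleavages, modifications, and instrument-dependent tolerances.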