Zhao Wei, Wang Likun, Zhang Tian-Xiang, Zhao Ze-Ning, Du Pu-Feng
School of Computer Science and Technology, Tianjin University, Tianjin 300350, China.
Beijing Key Laboratory of Tumor Systems Biology, Department of Pathology, School of Basic Medical Sciences, Institute of Systems Biomedicine, Peking University Health Science Center, Beijing 100191, China.
Protein Pept Lett. 2018;25(9):822-829. doi: 10.2174/0929866525666180905111124.
In the post-genome era, understanding the functions of genes and proteins has become increasingly urgent. Because experimental methods are usually costly and time-consuming, computational prediction is recognized as an alternative approach. In developing a predictive method for functional genomics and proteomics, one of the most important steps is to represent biological sequences in a fixed-length numerical form that machine learning algorithms can then analyze. Chou's pseudo-amino acid composition and the pseudo k-nucleotide composition are algorithms for this purpose.
Since these algorithms appeared, several software tools have been developed to implement them and to facilitate their application. Because these tools were built with different technologies and for different application scenarios, this short review briefly surveys their technical aspects.
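To make the idea of a fixed-length numerical representation concrete, the following is a minimal Python sketch, not the implementation used by any of the reviewed tools. It computes a simplified pseudo-amino acid composition (20 composition terms plus λ sequence-order correlation terms) and a plain k-mer composition for nucleotide sequences. Several points are assumptions made for illustration: the function names `pseaac` and `kmer_composition` are hypothetical, only one physicochemical property (the Kyte-Doolittle hydropathy scale) is used where the original formulation combines several, the defaults `lam=3` and `w=0.05` are arbitrary, and the nucleotide function omits the correlation terms that the pseudo k-nucleotide composition adds.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumption: used here as the only property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    mean = sum(scale.values()) / len(scale)
    std = (sum((v - mean) ** 2 for v in scale.values()) / len(scale)) ** 0.5
    return {aa: (v - mean) / std for aa, v in scale.items()}


H = _standardize(KYTE_DOOLITTLE)


def pseaac(seq, lam=3, w=0.05):
    """Return a (20 + lam)-dimensional vector for a protein sequence."""
    seq = seq.upper()
    if any(residue not in H for residue in seq):
        raise ValueError("sequence contains a non-standard residue")
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")

    # Conventional amino acid composition: 20 normalized frequencies.
    counts = Counter(seq)
    freqs = [counts[aa] / n for aa in AMINO_ACIDS]

    # Sequence-order correlation factors theta_1 .. theta_lam:
    # mean squared property difference between residues k positions apart.
    thetas = []
    for k in range(1, lam + 1):
        total = sum((H[seq[i + k]] - H[seq[i]]) ** 2 for i in range(n - k))
        thetas.append(total / (n - k))

    # Joint normalization so that all 20 + lam components sum to 1.
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


def kmer_composition(seq, k=2):
    """Return the 4**k normalized k-mer frequencies of a DNA sequence."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]
```

Whatever the input length, `pseaac` returns 20 + `lam` values and `kmer_composition` returns 4**`k` values, which is what allows sequences of different lengths to be passed to the same machine learning model.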