Meng Chaolu, Hou Yongqi, Zou Quan, Shi Lei, Su Xi, Ju Ying
College of Computer and Information Engineering, Inner Mongolia Agricultural University, Hohhot, China.
Inner Mongolia Autonomous Region Key Laboratory of Big Data Research and Application of Agriculture and Animal Husbandry, Hohhot, China.
Genomics Inform. 2024 Dec 4;22(1):29. doi: 10.1186/s44342-024-00026-z.
In protein identification, researchers increasingly aim to classify proteins efficiently using fewer features. Many feature selection methods effectively reduce the number of model features, but because they merely select or discard features, they often lose information, which limits classifier performance. To address this issue, we present Rore, an algorithm based on a feature-dimensionality reduction strategy. By mapping the original features to a latent space, Rore retains all relevant feature information while representing it with fewer latent features. This approach largely preserves the original information and overcomes the information loss associated with earlier feature selection methods. In extensive experimental validation and analysis, Rore performed well on an antioxidant protein dataset, achieving an accuracy of 95.88% and a Matthews correlation coefficient (MCC) of 91.78% with vectors of only 15 features. The Rore algorithm is available online at http://112.124.26.17:8021/Rore .
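The contrast the abstract draws between selecting features and mapping them to a latent space can be illustrated with a minimal sketch. This is not the Rore algorithm: the abstract does not specify the mapping, so a fixed linear projection stands in for it here. The sample vectors, the `keep` indices, the weight matrix `W`, and the helper names `select_features` and `project` are all hypothetical. The example shows only that a discarded feature can no longer influence a selected representation, whereas a latent coordinate can mix contributions from every input feature.

```python
from typing import List, Sequence

Vector = List[float]


def select_features(x: Sequence[float], keep: Sequence[int]) -> Vector:
    """Feature selection: keep only the listed columns and discard the rest."""
    return [x[i] for i in keep]


def project(x: Sequence[float], weights: Sequence[Sequence[float]]) -> Vector:
    """Linear map to a latent space: each latent coordinate mixes all inputs."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


# Two hypothetical 4-feature samples that differ only in the last feature.
a = [1.0, 2.0, 3.0, 4.0]
b = [1.0, 2.0, 3.0, 9.0]

# Hypothetical selection: keep features 0 and 1, drop features 2 and 3.
keep = [0, 1]

# Hypothetical 2x4 projection matrix (assumed, not learned from data).
W = [
    [0.5, 0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5, -0.5],
]

sel_a, sel_b = select_features(a, keep), select_features(b, keep)
lat_a, lat_b = project(a, W), project(b, W)

# After selection the two samples are indistinguishable;
# after projection they remain distinct.
print("selected :", sel_a, sel_b, "equal:", sel_a == sel_b)
print("projected:", lat_a, lat_b, "equal:", lat_a == lat_b)
```

Both representations have two dimensions, but only the projected one still reflects the difference in the dropped feature. A real dimensionality reduction method would learn the mapping from data, and a low-dimensional projection can still lose information in general.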