MRMD3.0: A Python Tool and Webserver for Dimensionality Reduction and Data Visualization via an Ensemble Strategy.

Author information

Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou, Zhejiang, China; Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan.

Department of Computer Science, University of Tsukuba, Tsukuba, Ibaraki 305-8577, Japan.

Publication information

J Mol Biol. 2023 Jul 15;435(14):168116. doi: 10.1016/j.jmb.2023.168116. Epub 2023 Apr 21.

Abstract

Dimensionality reduction is an active topic in machine learning that helps researchers identify key features in complex medical or biological data, which is crucial for biological sequence research, drug development, and related fields. However, when applied to a specific dataset, different dimensionality reduction methods generate different results, which makes the outcome unstable and makes parameter tuning time-consuming. Extracting high-quality features, genes, or attributes from complex data is therefore an important and challenging task. To ensure the efficiency, robustness, and accuracy of experiments, we developed MRMD3.0, a dimensionality reduction tool based on an ensemble strategy of link analysis. It works in two steps: first, an ensemble method integrates different feature ranking algorithms to calculate feature importance; then, a forward feature search strategy combined with cross-validation explores the proper feature combination. Compared with the previous version, MRMD3.0 adds more link-based ensemble algorithms, including PageRank, HITS, LeaderRank, and TrustRank. It also adds more feature ranking algorithms and greatly improves their effectiveness and calculation speed. In addition, the newest version provides an interface for each feature ranking method and five kinds of charts to help users analyze features. Finally, we provide an online webserver to help researchers analyze their data.
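The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not MRMD3.0's actual implementation: the graph construction (each feature links to every feature ranked above it), the nearest-centroid classifier, the synthetic data, and the function names `pagerank_aggregate`, `cv_accuracy`, and `forward_search` are all hypothetical choices made to keep the example free of third-party dependencies.

```python
import random


def pagerank_aggregate(rankings, damping=0.85, iters=100):
    """Fuse several rankings (best feature first) into one using PageRank."""
    features = sorted({f for r in rankings for f in r})
    # Assumed graph construction: in every ranking, each feature "votes" for
    # (links to) all features ranked above it; repeated votes add edge weight.
    out = {f: {} for f in features}
    for r in rankings:
        for i, better in enumerate(r):
            for worse in r[i + 1:]:
                out[worse][better] = out[worse].get(better, 0) + 1
    n = len(features)
    score = {f: 1.0 / n for f in features}
    for _ in range(iters):
        new = {f: (1 - damping) / n for f in features}
        for src, targets in out.items():
            total = sum(targets.values())
            if total == 0:
                # Dangling node (never outranked): spread its score uniformly.
                for f in features:
                    new[f] += damping * score[src] / n
            else:
                for dst, w in targets.items():
                    new[dst] += damping * score[src] * w / total
        score = new
    return sorted(features, key=lambda f: -score[f])


def cv_accuracy(X, y, cols, k=5):
    """k-fold CV accuracy of a nearest-centroid classifier on the given columns."""
    correct = 0
    for fold in range(k):
        train = [i for i in range(len(X)) if i % k != fold]
        test = [i for i in range(len(X)) if i % k == fold]
        centroids = {}
        for label in {y[i] for i in train}:
            rows = [X[i] for i in train if y[i] == label]
            centroids[label] = [sum(r[c] for r in rows) / len(rows) for c in cols]
        for i in test:
            pred = min(
                centroids,
                key=lambda lab: sum(
                    (X[i][c] - m) ** 2 for c, m in zip(cols, centroids[lab])
                ),
            )
            correct += pred == y[i]
    return correct / len(X)


def forward_search(X, y, ranked, k=5):
    """Add features in rank order; keep the prefix with the best CV accuracy."""
    best_score, best_n = -1.0, 0
    for n in range(1, len(ranked) + 1):
        s = cv_accuracy(X, y, ranked[:n], k)
        if s > best_score:
            best_score, best_n = s, n
    return ranked[:best_n], best_score


# Synthetic data (assumption): column 0 separates the classes, columns 1-3 are noise.
rng = random.Random(0)
y = [i % 2 for i in range(60)]
X = [[6.0 * label + rng.gauss(0, 1)] + [rng.gauss(0, 1) for _ in range(3)]
     for label in y]

# Hypothetical outputs of three feature rankers (column indices, best first).
rankings = [[0, 1, 2, 3], [0, 2, 1, 3], [0, 1, 3, 2]]
fused = pagerank_aggregate(rankings)
selected, score = forward_search(X, y, fused)
print("fused ranking:", fused)
print("selected features:", selected, "CV accuracy:", round(score, 3))
```

In this sketch the forward search only evaluates prefixes of the fused ranking, so the cost is one cross-validation run per feature. Swapping PageRank for another link-analysis method such as HITS would change only the scoring loop inside `pagerank_aggregate`.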

Availability and implementation

Webserver: http://lab.malab.cn/soft/MRMDv3/home.html

GitHub: https://github.com/heshida01/MRMD3.0
