iFeatureOmega: an integrative platform for engineering, visualization and analysis of features from molecular sequences, structural and ligand data sets.

Affiliations

Collaborative Innovation Center of Henan Grain Crops, Henan Agricultural University, Zhengzhou 450046, China.

Center for Crop Genome Engineering, Henan Agricultural University, Zhengzhou 450046, China.

Publication information

Nucleic Acids Res. 2022 Jul 5;50(W1):W434-W447. doi: 10.1093/nar/gkac351.

DOI: 10.1093/nar/gkac351

PMID: 35524557

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9252729/

Abstract

The rapid accumulation of molecular data motivates development of innovative approaches to computationally characterize sequences, structures and functions of biological and chemical molecules in an efficient, accessible and accurate manner. Notwithstanding several computational tools that characterize protein or nucleic acids data, there are no one-stop computational toolkits that comprehensively characterize a wide range of biomolecules. We address this vital need by developing a holistic platform that generates features from sequence and structural data for a diverse collection of molecule types. Our freely available and easy-to-use iFeatureOmega platform generates, analyzes and visualizes 189 representations for biological sequences, structures and ligands. To the best of our knowledge, iFeatureOmega provides the largest scope when directly compared to the current solutions, in terms of the number of feature extraction and analysis approaches and coverage of different molecules. We release three versions of iFeatureOmega including a webserver, command line interface and graphical interface to satisfy needs of experienced bioinformaticians and less computer-savvy biologists and biochemists. With the assistance of iFeatureOmega, users can encode their molecular data into representations that facilitate construction of predictive models and analytical studies. We highlight benefits of iFeatureOmega based on three research applications, demonstrating how it can be used to accelerate and streamline research in bioinformatics, computational biology, and cheminformatics areas. The iFeatureOmega webserver is freely available at http://ifeatureomega.erc.monash.edu and the standalone versions can be downloaded from https://github.com/Superzchen/iFeatureOmega-GUI/ and https://github.com/Superzchen/iFeatureOmega-CLI/.
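As a rough illustration of what encoding molecular data into a representation can mean, the sketch below computes amino acid composition, a widely used fixed-length descriptor for protein sequences. It is a self-contained example written for this page, not iFeatureOmega's code or API; the function name `amino_acid_composition` and the constant `AMINO_ACIDS` are hypothetical, and the abstract does not state which descriptors the platform implements.

```python
from collections import Counter
from typing import List

# The 20 standard amino acids in a fixed order, so every vector has the same layout.
# (Illustrative constant; not taken from iFeatureOmega.)
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> List[float]:
    """Return the fraction of each standard amino acid in `sequence`.

    Non-standard characters (gaps, X, and so on) are ignored.
    """
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Each sequence, whatever its length, becomes a 20-dimensional vector
    # that a downstream predictive model could consume.
    for name, seq in [("pep1", "ACDKKLMA"), ("pep2", "GGSWYV")]:
        vector = amino_acid_composition(seq)
        print(name, [round(v, 3) for v in vector])
```

Sequences of different lengths all map to vectors of the same size, which is the property that makes such representations convenient inputs for predictive models.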

Graphical abstract: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5c4b/9252729/6a3a1a38af8b/gkac351figgra1.jpg

Similar articles

iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.

Nucleic Acids Res. 2021 Jun 4;49(10):e60. doi: 10.1093/nar/gkab122.

Leaf-GP: an open and automated software application for measuring growth phenotypes for arabidopsis and wheat.

Plant Methods. 2017 Dec 22;13:117. doi: 10.1186/s13007-017-0266-3. eCollection 2017.

Realtime analysis and visualization of MinION sequencing data with npReader.

Bioinformatics. 2016 Mar 1;32(5):764-6. doi: 10.1093/bioinformatics/btv658. Epub 2015 Nov 10.

TBtools: An Integrative Toolkit Developed for Interactive Analyses of Big Biological Data.

Mol Plant. 2020 Aug 3;13(8):1194-1202. doi: 10.1016/j.molp.2020.06.009. Epub 2020 Jun 23.

MBEToolbox: a MATLAB toolbox for sequence data analysis in molecular biology and evolution.

BMC Bioinformatics. 2005 Mar 22;6:64. doi: 10.1186/1471-2105-6-64.

iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences.

Bioinformatics. 2018 Jul 15;34(14):2499-2502. doi: 10.1093/bioinformatics/bty140.

Automated programming for bioinformatics algorithm deployment.

Bioinformatics. 2008 Feb 1;24(3):450-1. doi: 10.1093/bioinformatics/btm602. Epub 2008 Jan 3.

GUIDEMOL: A Python graphical user interface for molecular descriptors based on RDKit.

Mol Inform. 2024 Jan;43(1):e202300190. doi: 10.1002/minf.202300190. Epub 2023 Nov 20.

Eureka-DMA: an easy-to-operate graphical user interface for fast comprehensive investigation and analysis of DNA microarray data.

BMC Bioinformatics. 2014 Feb 24;15:53. doi: 10.1186/1471-2105-15-53.

Cited by

Exo-Tox: Identifying Exotoxins from secreted bacterial proteins.

BioData Min. 2025 Aug 8;18(1):52. doi: 10.1186/s13040-025-00469-2.

siRNA Features-Automated Machine Learning of 3D Molecular Fingerprints and Structures for Therapeutic Off-Target Data.

Int J Mol Sci. 2025 Jul 16;26(14):6795. doi: 10.3390/ijms26146795.

PREDAC-FluB: predicting antigenic clusters of seasonal influenza B viruses with protein language model embedding based convolutional neural network.

Brief Bioinform. 2025 Jul 2;26(4). doi: 10.1093/bib/bbaf308.

Graph-RPI: predicting RNA-protein interactions via graph autoencoder and self-supervised learning strategies.

Brief Bioinform. 2025 May 1;26(3). doi: 10.1093/bib/bbaf292.

m5CStack: An integrated framework for m5C site prediction using multi-feature stacking.

Comput Struct Biotechnol J. 2025 May 12;27:1901-1912. doi: 10.1016/j.csbj.2025.05.004. eCollection 2025.

Prediction and validation of nanowire proteins in G20 using machine learning and feature engineering.

Comput Struct Biotechnol J. 2025 Apr 19;27:1706-1718. doi: 10.1016/j.csbj.2025.04.022. eCollection 2025.

TetraRNA, a tetra-class machine learning model for deciphering the coding potential derivation of RNA world.

Comput Struct Biotechnol J. 2025 Mar 26;27:1305-1317. doi: 10.1016/j.csbj.2025.03.039. eCollection 2025.

SUMO-LMNet: Lossless mapping network for predicting SUMOylation sites in SUMO1 and SUMO2 using high-dimensional features.

Comput Struct Biotechnol J. 2025 Mar 6;27:1048-1059. doi: 10.1016/j.csbj.2025.03.005. eCollection 2025.

Phased T2T genome assemblies facilitate the mining of disease-resistance genes in .

Hortic Res. 2024 Nov 6;12(2):uhae306. doi: 10.1093/hr/uhae306. eCollection 2025 Feb.

Empirical Comparison and Analysis of Artificial Intelligence-Based Methods for Identifying Phosphorylation Sites of SARS-CoV-2 Infection.

Int J Mol Sci. 2024 Dec 21;25(24):13674. doi: 10.3390/ijms252413674.
