Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach.

Affiliations

Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Baja California, 22860, Mexico.

Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Av. Interoceánica Km 12 1/2 y Av. Florencia, 17-1200-841, Quito, Ecuador.

Publication information

Sci Rep. 2020 Oct 22;10(1):18074. doi: 10.1038/s41598-020-75029-1.


DOI:10.1038/s41598-020-75029-1
PMID:33093586
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7583304/
Abstract

Growing interest in bioactive peptides with therapeutic potential is reflected in the large variety of biological databases published in recent years. However, discovering knowledge from these heterogeneous data sources is a nontrivial task, and it is the focus of our research. We therefore devise a unified data model based on molecular similarity networks to represent a chemical reference space of bioactive peptides; this space holds implicit knowledge that existing biological databases do not currently expose. Our main contribution is a novel workflow for the automatic construction of such similarity networks, enabling visual graph mining techniques to uncover new insights from the "ocean" of known bioactive peptides. The workflow relies on the following sequential steps: (i) calculation of molecular descriptors by applying statistical and aggregation operators on amino acid property vectors; (ii) a two-stage unsupervised feature selection method that identifies an optimized subset of descriptors using the concepts of entropy and mutual information; (iii) generation of sparse networks in which nodes represent bioactive peptides and edges between two nodes denote their pairwise similarity/distance relationships in the defined descriptor space; and (iv) exploratory analysis that combines visual inspection with clustering and network science techniques. For practical purposes, the proposed workflow has been implemented in our visual analytics software tool (http://mobiosd-hub.com/starpep/) to assist researchers in extracting useful information from an integrated collection of 45120 bioactive peptides, one of the largest and most diverse datasets in its field. Finally, we illustrate the applicability of the proposed workflow for discovering central nodes in molecular similarity networks that may represent the biologically relevant chemical space known to date.
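
The four workflow steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state which amino acid properties, aggregation operators, binning scheme, or thresholds were used, so the Kyte-Doolittle hydropathy scale, the four summary statistics, the equal-width binning, the cutoffs `min_entropy`, `max_redundancy`, and `threshold`, and all function names below are assumptions chosen only for illustration. Degree centrality stands in for the unspecified network science techniques of step (iv).

```python
import math
from collections import Counter
from itertools import combinations

# Kyte-Doolittle hydropathy scale, used here as ONE illustrative property
# vector; the properties used in the paper are not given in the abstract.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}

def descriptors(seq):
    """Step (i): aggregate a per-residue property vector into descriptors."""
    v = [HYDROPATHY[a] for a in seq]
    mean = sum(v) / len(v)
    std = math.sqrt(sum((x - mean) ** 2 for x in v) / len(v))
    return [mean, std, min(v), max(v)]

def discretize(column, bins=4):
    """Equal-width binning so entropy and mutual information can be estimated."""
    lo, hi = min(column), max(column)
    if hi == lo:
        return [0] * len(column)
    return [min(int((x - lo) / (hi - lo) * bins), bins - 1) for x in column]

def entropy(labels):
    """Shannon entropy (bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def mutual_information(a, b):
    """I(A;B) = H(A) + H(B) - H(A,B) for two discrete label sequences."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))

def select_features(matrix, min_entropy=0.5, max_redundancy=0.9):
    """Step (ii), assumed form: drop low-entropy columns, then drop columns
    whose normalized mutual information with a kept column is too high."""
    cols = [discretize(list(c)) for c in zip(*matrix)]
    informative = [j for j, c in enumerate(cols) if entropy(c) >= min_entropy]
    informative.sort(key=lambda j: -entropy(cols[j]))
    kept = []
    for j in informative:
        redundant = any(
            mutual_information(cols[j], cols[k])
            / min(entropy(cols[j]), entropy(cols[k])) > max_redundancy
            for k in kept
        )
        if not redundant:
            kept.append(j)
    return sorted(kept)

def similarity_network(matrix, threshold=0.8):
    """Step (iii): keep an edge only when similarity >= threshold (sparse)."""
    cols = list(zip(*matrix))
    spans = [(min(c), (max(c) - min(c)) or 1.0) for c in cols]
    scaled = [[(x - lo) / span for x, (lo, span) in zip(row, spans)]
              for row in matrix]
    max_dist = math.sqrt(len(cols)) or 1.0  # largest distance in the unit cube
    edges = {}
    for i, j in combinations(range(len(scaled)), 2):
        sim = 1.0 - math.dist(scaled[i], scaled[j]) / max_dist
        if sim >= threshold:
            edges[(i, j)] = sim
    return edges

def degree_centrality(edges, n):
    """Step (iv): fraction of the other nodes each node is connected to."""
    deg = [0] * n
    for i, j in edges:
        deg[i] += 1
        deg[j] += 1
    return [d / max(n - 1, 1) for d in deg]

if __name__ == "__main__":
    # Arbitrary toy sequences, not taken from the paper's peptide collection.
    peptides = ["GLFKKLLK", "GLFKKILK", "DDEESSGG", "AVILMFWC", "KRKRKRKR"]
    X = [descriptors(p) for p in peptides]
    kept = select_features(X)
    X_sel = [[row[j] for j in kept] for row in X]
    edges = similarity_network(X_sel, threshold=0.8)
    print(kept, edges, degree_centrality(edges, len(peptides)))
```

Raising `threshold` makes the network sparser, which is the knob one would tune before inspecting clusters or central nodes. A real analysis would likely combine many property scales and operators rather than the single scale assumed here.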

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/bc854f1cba7f/41598_2020_75029_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/67374c1f8ab6/41598_2020_75029_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/1e5263b2def1/41598_2020_75029_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/dcd9a53adb83/41598_2020_75029_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/62e472fe1f3a/41598_2020_75029_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/64062aae95f9/41598_2020_75029_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/b5333fe2a43d/41598_2020_75029_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/001c04387565/41598_2020_75029_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/a7b46a2c84bb/41598_2020_75029_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/eb2f4609c7d2/41598_2020_75029_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/b6e1f77b983c/41598_2020_75029_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/829dec61bf86/41598_2020_75029_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/c630d0a00aae/41598_2020_75029_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/e6d486c586ba/41598_2020_75029_Fig14_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/dc94a557e857/41598_2020_75029_Fig15_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/ef7078839c29/41598_2020_75029_Fig16_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3c31/7583304/8dd28a3dc6ec/41598_2020_75029_Fig17_HTML.jpg

Similar articles

[1]
Automatic construction of molecular similarity networks for visual graph mining in chemical space of bioactive peptides: an unsupervised learning approach.

Sci Rep. 2020-10-22

[2]
Graph-based data integration from bioactive peptide databases of pharmaceutical interest: toward an organized collection enabling visual network analysis.

Bioinformatics. 2019-11-1

[3]
Network Science and Group Fusion Similarity-Based Searching to Explore the Chemical Space of Antiparasitic Peptides.

ACS Omega. 2022-12-6

[4]
StarPep Toolbox: an open-source software to assist chemical space analysis of bioactive peptides and their functions using complex networks.

Bioinformatics. 2023-8-1

[5]
Unsupervised construction of computational graphs for gene expression data with explicit structural inductive biases.

Bioinformatics. 2022-2-7

[6]
TargetHunter: an in silico target identification tool for predicting therapeutic potential of small organic molecules based on chemogenomic database.

AAPS J. 2013-1-5

[7]
Clustering approaches for visual knowledge exploration in molecular interaction networks.

BMC Bioinformatics. 2018-8-29

[8]
Algorithms for effective querying of compound graph-based pathway databases.

BMC Bioinformatics. 2009-11-16

[9]
Text mining-based word representations for biomedical data analysis and protein-protein interaction networks in machine learning tasks.

PLoS One. 2021

[10]
SpirPep: an in silico digestion-based platform to assist bioactive peptides discovery from a genome-wide database.

BMC Bioinformatics. 2018-4-20

Cited by

[1]
Optimal Descriptor Subset Search via Chemical Information and Target Activity-Guided Algorithm for Antimicrobial Peptide Prediction.

J Chem Inf Model. 2025-7-14

[2]
Unlocking Antimicrobial Peptides: In Silico Proteolysis and Artificial Intelligence-Driven Discovery from Cnidarian Omics.

Molecules. 2025-1-25

[3]
Advances of deep Neural Networks (DNNs) in the development of peptide drugs.

Future Med Chem. 2025-2

[4]
Unveiling Encrypted Antimicrobial Peptides from Cephalopods' Salivary Glands: A Proteolysis-Driven Virtual Approach.

ACS Omega. 2024-10-14

[5]
Peptide hemolytic activity analysis using visual data mining of similarity-based complex networks.

NPJ Syst Biol Appl. 2024-10-4

[6]
Innovative Alignment-Based Method for Antiviral Peptide Prediction.

Antibiotics (Basel). 2024-8-14

[7]
Metabolic Connectome and Its Role in the Prediction, Diagnosis, and Treatment of Complex Diseases.

Metabolites. 2024-1-26

[8]
Machine Learning-Enabled Genome Mining and Bioactivity Prediction of Natural Products.

ACS Synth Biol. 2023-9-15

[9]
StarPep Toolbox: an open-source software to assist chemical space analysis of bioactive peptides and their functions using complex networks.

Bioinformatics. 2023-8-1

[10]
Complex Networks Analyses of Antibiofilm Peptides: An Emerging Tool for Next-Generation Antimicrobials' Discovery.

Antibiotics (Basel). 2023-4-13
