LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs.

Author information

Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Av. Interoceánica Km 12 ½, Cumbayá, Quito 170157, Ecuador; Grupo GINUMED, Corporación Universitaria Rafael Núñez, Facultad de Salud, Programa de Medicina, Cartagena, Colombia; Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain.

Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Av. Interoceánica Km 12 ½ -Cumbayá, Quito 170157, Ecuador; Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica, Departamento de Ingeniería Química, Diego de Robles y vía Interoceánica, Quito, 170157, Pichincha, Ecuador.

Publication information

J Theor Biol. 2020 Jan 21;485:110039. doi: 10.1016/j.jtbi.2019.110039. Epub 2019 Oct 4.

DOI: 10.1016/j.jtbi.2019.110039
PMID: 31589877

Abstract

Novel 3D protein descriptors based on bilinear, quadratic and linear algebraic maps in ℝⁿ are proposed. The latter employs the k-th 2-tuple (dis)similarity matrix to codify information related to covalent and non-covalent interactions in these biopolymers. The calculation of the inter-amino acid distances is generalized by using several dissimilarity coefficients, where normalization procedures based on the simple stochastic and mutual probability schemes are applied. A new local-fragment approach based on amino acid types and amino acid groups is proposed to characterize regions of interest in proteins. Topological and geometric macromolecular cutoffs are defined using local and total indices to highlight non-covalent interactions existing between the side chains of each amino acid. Moreover, the calculation of local and total indices is generalized through a LEGO approach that uses several aggregation operators. Collinearity and variability analyses are performed to evaluate every generalizing component applied to the definition of these novel indices. These experiments aim to reduce the number of molecular descriptors (MDs) used to build prediction models. The predictive power of the proposed indices was evaluated using two benchmark datasets: folding rate and secondary structural classification of proteins. The proposed MDs are modeled using Multiple Linear Regression (MLR) for the folding rate and Support Vector Machine (SVM) for the structural classification. The best regression model developed for the folding rate of proteins yields a cross-validation coefficient of 0.875 (Test Set), and the best model developed for secondary structural classification correctly classified 98% of instances (Test Set). These statistical parameters are superior to the ones obtained with existing MDs reported in the literature.
Overall, the new theoretical generalization enhanced the information extraction into the MDs, allowing a better correlation between these two evaluated benchmark datasets and the proposed indices. The optimal theoretical configurations defined for the calculation of these MDs consider low collinearity and less information redundancy among them. These theoretical configurations and the software are available at http://tomocomd.com/mulims-mcompas.
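
The pipeline outlined in the abstract (pairwise inter-residue distances, row normalization, an algebraic form over residue properties, then aggregation of local contributions) can be illustrated with a minimal Python sketch. It is not the authors' software: every function name, the Minkowski distance used as the dissimilarity coefficient, the row-sum reading of the "simple stochastic" scheme, and the toy coordinates and property values are assumptions made only for illustration.

```python
import math

def distance_matrix(coords, p=2.0):
    """Pairwise Minkowski distances between residue coordinates (p=2 is Euclidean).

    Assumption: one representative 3D point per amino acid.
    """
    n = len(coords)
    return [[sum(abs(a - b) ** p for a, b in zip(coords[i], coords[j])) ** (1.0 / p)
             for j in range(n)] for i in range(n)]

def simple_stochastic(matrix):
    """Divide each row by its row sum, so every non-zero row sums to 1.

    Assumption: this is one plausible reading of a 'simple stochastic' scheme.
    """
    normalized = []
    for row in matrix:
        total = sum(row)
        normalized.append([v / total if total else 0.0 for v in row])
    return normalized

def bilinear_form(x, matrix, y):
    """Evaluate x^T M y; passing the same vector twice gives a quadratic form."""
    return sum(x[i] * matrix[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

def local_contributions(x, matrix, y):
    """Split x^T M y into one contribution per residue (one term per row i)."""
    return [x[i] * sum(matrix[i][j] * y[j] for j in range(len(y)))
            for i in range(len(x))]

def aggregate(values, operator="sum"):
    """Combine per-residue contributions with a chosen aggregation operator."""
    if operator == "sum":
        return sum(values)
    if operator == "mean":
        return sum(values) / len(values)
    if operator == "max":
        return max(values)
    if operator == "euclidean_norm":
        return math.sqrt(sum(v * v for v in values))
    raise ValueError(f"unknown aggregation operator: {operator}")

if __name__ == "__main__":
    # Hypothetical three-residue fragment with made-up coordinates and
    # a made-up per-residue property vector.
    coords = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (6.0, 8.0, 0.0)]
    prop = [1.0, 2.0, 3.0]

    m = simple_stochastic(distance_matrix(coords))
    local = local_contributions(prop, m, prop)
    for op in ("sum", "mean", "max", "euclidean_norm"):
        print(op, aggregate(local, op))
```

With the `sum` operator the aggregated value equals the total quadratic form, while the other operators yield different descriptors from the same local contributions, which is the sense in which interchangeable aggregation operators behave like building blocks in this sketch.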

Similar articles

1
LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs.
J Theor Biol. 2020 Jan 21;485:110039. doi: 10.1016/j.jtbi.2019.110039. Epub 2019 Oct 4.
2
: A Novel Multiplatform Framework to Compute Tensor Algebra-Based Three-Dimensional Protein Descriptors.
J Chem Inf Model. 2020 Feb 24;60(2):1042-1059. doi: 10.1021/acs.jcim.9b00629. Epub 2019 Oct 30.
3
QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computations.
J Cheminform. 2017 Jun 7;9(1):35. doi: 10.1186/s13321-017-0211-5.
4
Novel 3D bio-macromolecular bilinear descriptors for protein science: Predicting protein structural classes.
J Theor Biol. 2015 Jun 7;374:125-37. doi: 10.1016/j.jtbi.2015.03.026. Epub 2015 Apr 3.
5
N-linear algebraic maps for chemical structure codification: a suitable generalization for atom-pair approaches?
Curr Drug Metab. 2014;15(4):441-69. doi: 10.2174/1389200215666140605124506.
6
N-tuple topological/geometric cutoffs for 3D N-linear algebraic molecular codifications: variability, linear independence and QSAR analysis.
SAR QSAR Environ Res. 2016 Dec;27(12):949-975. doi: 10.1080/1062936X.2016.1231714. Epub 2016 Oct 6.
7
Fuzzy spherical truncation-based multi-linear protein descriptors: From their definition to application in structural-related predictions.
Front Chem. 2022 Oct 7;10:959143. doi: 10.3389/fchem.2022.959143. eCollection 2022.
8
QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps.
J Comput Chem. 2014 Jul 5;35(18):1395-409. doi: 10.1002/jcc.23640. Epub 2014 Jun 2.
9
Protein linear indices of the 'macromolecular pseudograph alpha-carbon atom adjacency matrix' in bioinformatics. Part 1: prediction of protein stability effects of a complete set of alanine substitutions in Arc repressor.
Bioorg Med Chem. 2005 Apr 15;13(8):3003-15. doi: 10.1016/j.bmc.2005.01.062.
10
Exploring the QSAR's predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets.
SAR QSAR Environ Res. 2017 May;28(5):367-389. doi: 10.1080/1062936X.2017.1326403.

Cited by

1
Chemical feature-based machine learning model for predicting photophysical properties of BODIPY compounds: density functional theory and quantitative structure-property relationship modeling.
J Mol Model. 2024 Dec 12;31(1):18. doi: 10.1007/s00894-024-06240-4.
2
An overview of descriptors to capture protein properties - Tools and perspectives in the context of QSAR modeling.
Comput Struct Biotechnol J. 2023 May 24;21:3234-3247. doi: 10.1016/j.csbj.2023.05.022. eCollection 2023.
3
Fuzzy spherical truncation-based multi-linear protein descriptors: From their definition to application in structural-related predictions.
Front Chem. 2022 Oct 7;10:959143. doi: 10.3389/fchem.2022.959143. eCollection 2022.