PROTA: A Robust Tool for Protamine Prediction Using a Hybrid Approach of Machine Learning and Deep Learning.

Affiliations

Department of Chemical Engineering, Faculty of Engineering and Science, Universidad de La Frontera, Ave. Francisco Salazar 01145, Temuco 4811230, Chile.

Departamento de Ciencias Básicas, Facultad de Ciencias, Universidad Santo Tomas, Temuco 4780000, Chile.

Publication information

Int J Mol Sci. 2024 Sep 24;25(19):10267. doi: 10.3390/ijms251910267.

DOI:10.3390/ijms251910267
PMID:39408595
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11476296/
Abstract

Protamines play a critical role in DNA compaction and stabilization in sperm cells, significantly influencing male fertility and various biotechnological applications. Identifying these proteins has traditionally been challenging and time-consuming because of their species-specific variability and complexity. Leveraging advancements in computational biology, we present PROTA, a novel tool that combines machine learning (ML) and deep learning (DL) techniques to predict protamines with high accuracy. For the first time, we integrate Generative Adversarial Networks (GANs) with supervised learning methods to enhance the accuracy and generalizability of protamine prediction. Our methodology evaluated multiple ML models, including Light Gradient-Boosting Machine (LIGHTGBM), Multilayer Perceptron (MLP), Random Forest (RF), eXtreme Gradient Boosting (XGBOOST), k-Nearest Neighbors (KNN), Logistic Regression (LR), Naive Bayes (NB), and Radial Basis Function-Support Vector Machine (RBF-SVM). During ten-fold cross-validation on our training dataset, the MLP model trained with GAN-augmented data achieved the best performance: 0.997 accuracy, 0.997 F1 score, 0.998 precision, 0.997 sensitivity, and 1.0 AUC. In the independent testing phase, this model achieved 0.999 accuracy, 0.999 F1 score, 1.0 precision, 0.999 sensitivity, and 1.0 AUC. These results establish PROTA, which is accessible via a user-friendly web application, as a robust tool for protamine prediction. We anticipate that PROTA will be a crucial resource for researchers, enabling the rapid and reliable prediction of protamines and thereby advancing our understanding of their roles in reproductive biology, biotechnology, and medicine.
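The metrics reported in the abstract (accuracy, F1 score, precision, sensitivity, and AUC) can be illustrated with a minimal sketch. It assumes a binary labelling in which 1 means "protamine" and 0 means "non-protamine", and it uses a small made-up set of labels and scores. The function names `confusion_counts`, `classification_metrics`, and `roc_auc` are hypothetical helpers written for this illustration. They are not part of PROTA, and the paper may compute or average its metrics differently.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels, assuming 1 = protamine."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, sensitivity (recall), and F1 from hard predictions."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "f1": f1,
    }


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the positive
    example receives the higher score; ties count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up example data (not taken from the paper).
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # threshold of 0.5 is an assumption

metrics = classification_metrics(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(metrics, auc)
```

Accuracy, precision, sensitivity, and F1 depend on the decision threshold applied to the scores, whereas AUC is computed from the ranking of the scores alone. That is why a model can report an AUC of 1.0 alongside slightly lower threshold-based metrics.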

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f63/11476296/018c4ed719ab/ijms-25-10267-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f63/11476296/659e773a977d/ijms-25-10267-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3f63/11476296/9c9593b031ce/ijms-25-10267-g003.jpg
