A novel multitask learning algorithm for tasks with distinct chemical space: zebrafish toxicity prediction as an example.

Authors

Lin Run-Hsin, Lin Pinpin, Wang Chia-Chi, Tung Chun-Wei

Affiliations

Institute of Biotechnology and Pharmaceutical Research, National Health Research Institutes, Miaoli County, 35053, Taiwan.

Graduate Institute of Data Science, College of Management, Taipei Medical University, Taipei, 10675, Taiwan.

Publication information

J Cheminform. 2024 Aug 2;16(1):91. doi: 10.1186/s13321-024-00891-4.

DOI:10.1186/s13321-024-00891-4
PMID:39095893
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11297603/
Abstract

Data scarcity is one of the most critical issues impeding the development of prediction models for chemical effects. Multitask learning algorithms leveraging knowledge from relevant tasks showed potential for dealing with tasks with limited data. However, current multitask methods mainly focus on learning from datasets whose task labels are available for most of the training samples. Since datasets were generated for different purposes with distinct chemical spaces, the conventional multitask learning methods may not be suitable. This study presents a novel multitask learning method MTForestNet that can deal with data scarcity problems and learn from tasks with distinct chemical space. The MTForestNet consists of nodes of random forest classifiers organized in the form of a progressive network, where each node represents a random forest model learned from a specific task. To demonstrate the effectiveness of the MTForestNet, 48 zebrafish toxicity datasets were collected and utilized as an example. Among them, two tasks are very different from other tasks with only 1.3% common chemicals shared with other tasks. In an independent test, MTForestNet with a high area under the receiver operating characteristic curve (AUC) value of 0.911 provided superior performance over compared single-task and multitask methods. The overall toxicity derived from the developed models of zebrafish toxicity is well correlated with the experimentally determined overall toxicity. In addition, the outputs from the developed models of zebrafish toxicity can be utilized as features to boost the prediction of developmental toxicity. 
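The "1.3% common chemicals" figure can be read as an overlap ratio between the chemical sets of different tasks. The abstract does not give the exact definition, so the following is only a minimal sketch under one assumed definition: the fraction of a task's chemicals that also appear in at least one other task. The function name `shared_fraction` and the chemical identifiers are hypothetical.

```python
def shared_fraction(task_chemicals, task):
    """Fraction of one task's chemicals that also occur in any other task.

    task_chemicals maps a task name to a set of chemical identifiers.
    This definition of overlap is an assumption, not taken from the paper.
    """
    own = task_chemicals[task]
    # Union of the chemicals of every other task.
    others = set().union(
        *(chems for name, chems in task_chemicals.items() if name != task)
    )
    return len(own & others) / len(own) if own else 0.0


# Hypothetical toy data: identifiers stand in for real chemical structures.
toy = {
    "task_a": {"c1", "c2", "c3", "c4"},
    "task_b": {"c4", "c5"},
    "task_c": {"c6"},
}
for name in toy:
    print(name, shared_fraction(toy, name))
```

A task whose ratio is close to zero has almost no chemicals labeled for other tasks, which is the situation the abstract describes as a distinct chemical space.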
The developed models are effective for predicting zebrafish toxicity, and the proposed MTForestNet is expected to be useful for other tasks with distinct chemical space.

Scientific contribution

A novel multitask learning algorithm, MTForestNet, was proposed to address the challenge of developing models from datasets with distinct chemical space, a common issue in cheminformatics tasks. As an example, zebrafish toxicity prediction models were developed using MTForestNet, and they provide superior performance over conventional single-task and multitask learning methods. In addition, the developed zebrafish toxicity prediction models can reduce animal testing.
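The progressive arrangement of per-task models described in the abstract can be illustrated with a small sketch. It rests on several assumptions that the abstract does not confirm: each layer holds one model per task, and each later layer sees the original features plus one score per task from the previous layer. To stay dependency-free, a tiny forest of bagged depth-1 trees (stumps) is swapped in for a full random forest. All function names, the toy data, and the two-layer depth are hypothetical.

```python
import random


def fit_stump(X, y, feature_ids):
    """Pick the single-feature threshold split with the lowest squared error."""
    best = None
    for f in feature_ids:
        for t in sorted({row[f] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            p_left = sum(left) / len(left)  # never empty: t is an observed value
            p_right = sum(right) / len(right) if right else p_left
            err = (sum((yi - p_left) ** 2 for yi in left)
                   + sum((yi - p_right) ** 2 for yi in right))
            if best is None or err < best[0]:
                best = (err, f, t, p_left, p_right)
    return best[1:]


def fit_forest(X, y, rng, n_trees=25):
    """Stand-in for a random forest: bootstrap rows, random feature subsets, stumps."""
    n, d = len(X), len(X[0])
    k = max(1, int(d ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        feats = rng.sample(range(d), k)
        forest.append(fit_stump([X[i] for i in idx], [y[i] for i in idx], feats))
    return forest


def predict_proba(forest, x):
    """Average the leaf probabilities of all stumps."""
    votes = [pl if x[f] <= t else pr for f, t, pl, pr in forest]
    return sum(votes) / len(votes)


def augment(layers, names, x):
    """Pass x through the given layers; each layer yields one score per task,
    which is appended to the ORIGINAL features for the next layer."""
    feats = list(x)
    for layer in layers:
        scores = [predict_proba(layer[name], feats) for name in names]
        feats = list(x) + scores
    return feats


def fit_progressive(tasks, n_layers=2, seed=0):
    """tasks maps a task name to (X, y); every task may have its own chemicals."""
    rng = random.Random(seed)
    names = sorted(tasks)
    layers = []
    for _ in range(n_layers):
        layer = {}
        for name in names:
            X, y = tasks[name]
            # Any task's model can score any chemical, so label sparsity across
            # tasks does not block knowledge transfer in this wiring.
            layer[name] = fit_forest([augment(layers, names, x) for x in X], y, rng)
        layers.append(layer)
    return names, layers


def predict(model, x):
    names, layers = model
    feats = augment(layers[:-1], names, x)
    return {name: predict_proba(layers[-1][name], feats) for name in names}


def make_task(seed, n=30, d=4):
    """Hypothetical toy task: random features, label driven by feature 0."""
    rng = random.Random(seed)
    X = [[rng.random() for _ in range(d)] for _ in range(n)]
    y = [1 if row[0] > 0.5 else 0 for row in X]
    return X, y


tasks = {"task_a": make_task(1), "task_b": make_task(2)}
model = fit_progressive(tasks, n_layers=2)
print(predict(model, [0.9, 0.2, 0.4, 0.7]))
```

In this wiring the two toy tasks share no training rows at all, yet the second layer of each task can still use the other task's first-layer score as an extra feature. A practical version would score training chemicals out-of-bag or on held-out folds to limit label leakage between layers; that refinement is omitted here for brevity.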

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/01c15d5885bc/13321_2024_891_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/7670a5e1cf88/13321_2024_891_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/f56c36ba050c/13321_2024_891_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/4cf787a8d408/13321_2024_891_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/9f878a4a5cf2/13321_2024_891_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/535625ff835d/13321_2024_891_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/77799568faba/13321_2024_891_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/17eea6e44222/13321_2024_891_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/463b/11297603/7654e6f2531f/13321_2024_891_Fig9_HTML.jpg

Similar articles

1
A novel multitask learning algorithm for tasks with distinct chemical space: zebrafish toxicity prediction as an example.
J Cheminform. 2024 Aug 2;16(1):91. doi: 10.1186/s13321-024-00891-4.
2
Compositional model based on factorial evolution for realizing multi-task learning in bacterial virulent protein prediction.
Artif Intell Med. 2019 Nov;101:101757. doi: 10.1016/j.artmed.2019.101757. Epub 2019 Nov 7.
3
Improving five-year survival prediction via multitask learning across HPV-related cancers.
PLoS One. 2020 Nov 16;15(11):e0241225. doi: 10.1371/journal.pone.0241225. eCollection 2020.
4
Inferring latent task structure for Multitask Learning by Multiple Kernel Learning.
BMC Bioinformatics. 2010 Oct 26;11 Suppl 8(Suppl 8):S5. doi: 10.1186/1471-2105-11-S8-S5.
5
Task Sensitive Feature Exploration and Learning for Multitask Graph Classification.
IEEE Trans Cybern. 2017 Mar;47(3):744-758. doi: 10.1109/TCYB.2016.2526058. Epub 2016 Mar 10.
6
Prediction of Human Cytochrome P450 Inhibition Using a Multitask Deep Autoencoder Neural Network.
Mol Pharm. 2018 Oct 1;15(10):4336-4345. doi: 10.1021/acs.molpharmaceut.8b00110. Epub 2018 May 30.
7
Novel Multitask Conditional Neural-Network Surrogate Models for Expensive Optimization.
IEEE Trans Cybern. 2022 May;52(5):3984-3997. doi: 10.1109/TCYB.2020.3014126. Epub 2022 May 19.
8
A Framework for Deep Multitask Learning With Multiparametric Magnetic Resonance Imaging for the Joint Prediction of Histological Characteristics in Breast Cancer.
IEEE J Biomed Health Inform. 2022 Aug;26(8):3884-3895. doi: 10.1109/JBHI.2022.3179014. Epub 2022 Aug 11.
9
A multitask learning model for online pattern recognition.
IEEE Trans Neural Netw. 2009 Mar;20(3):430-45. doi: 10.1109/TNN.2008.2007961. Epub 2009 Feb 2.
10
Commonality and Individuality-Based Subspace Learning.
IEEE Trans Cybern. 2024 Mar;54(3):1456-1469. doi: 10.1109/TCYB.2022.3206064. Epub 2024 Feb 9.
