Predicting ATP-Binding Cassette Transporters Using the Random Forest Method.

Authors

Hou Ruiyan, Wang Lida, Wu Yi-Jun

Affiliations

Laboratory of Molecular Toxicology, State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, China.

College of Life Science, University of Chinese Academy of Sciences, Beijing, China.

Publication information

Front Genet. 2020 Mar 25;11:156. doi: 10.3389/fgene.2020.00156. eCollection 2020.

DOI:10.3389/fgene.2020.00156
PMID:32269586
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7109328/
Abstract

ATP-binding cassette (ABC) proteins play important roles in a wide variety of species. These proteins are involved in absorbing nutrients, exporting toxic substances, and regulating potassium channels, and they contribute to drug resistance in cancer cells. Therefore, the identification of ABC transporters is an urgent task. The present study used 188D as the feature extraction method, which is based on sequence information and physicochemical properties. We also visualized the feature extracted by t-Distributed Stochastic Neighbor Embedding (t-SNE). The sample based on the features extracted by 188D may be separated. Further, random forest (RF) is an efficient classifier to identify proteins. Under the 10-fold cross-validation of the model proposed here for a training set, the average accuracy rate of 10 training sets was 89.54%. We obtained values of 0.87 for specificity, 0.92 for sensitivity, and 0.79 for MCC. In the testing set, the accuracy achieved was 89%. These results suggest that the model combining 188D with RF is an optimal tool to identify ABC transporters.
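The abstract reports specificity, sensitivity, and the Matthews correlation coefficient (MCC) for a binary classifier. As a minimal sketch of how these metrics follow from a confusion matrix, the snippet below computes them in plain Python. The function name `binary_metrics` and the counts passed to it are hypothetical and chosen only for illustration; they are not taken from the paper's data, and the paper's actual class balance is not stated in the abstract.

```python
import math


def _ratio(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fn, tn, fp):
    """Compute standard binary-classification metrics from confusion counts.

    tp / fn: positives predicted correctly / incorrectly
    tn / fp: negatives predicted correctly / incorrectly
    """
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {
        "sensitivity": _ratio(tp, tp + fn),   # true positive rate
        "specificity": _ratio(tn, tn + fp),   # true negative rate
        "accuracy": _ratio(tp + tn, tp + fn + tn + fp),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Hypothetical counts (an assumed balanced set of 100 positives and
# 100 negatives), NOT the paper's data.
metrics = binary_metrics(tp=92, fn=8, tn=87, fp=13)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

With these illustrative counts, the sensitivity is 0.92 and the specificity is 0.87, and the MCC rounds to 0.79. Unlike accuracy, MCC uses all four cells of the confusion matrix, which makes it less sensitive to class imbalance.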

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/394648e72415/fgene-11-00156-g001.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/9c80a9d336fd/fgene-11-00156-g002.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/81d2c527f931/fgene-11-00156-g003.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/d05ad454096a/fgene-11-00156-g004.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/cbcf89be557e/fgene-11-00156-g005.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/1315/7109328/96283a684f4b/fgene-11-00156-g006.jpg

Similar articles

1. Predicting ATP-Binding Cassette Transporters Using the Random Forest Method.
   Front Genet. 2020 Mar 25;11:156. doi: 10.3389/fgene.2020.00156. eCollection 2020.
2. DeepRTCP: Predicting ATP-Binding Cassette Transporters Based on 1-Dimensional Convolutional Network.
   Front Cell Dev Biol. 2021 Feb 1;8:614080. doi: 10.3389/fcell.2020.614080. eCollection 2020.
3. CWLy-RF: A novel approach for identifying cell wall lyases based on random forest classifier.
   Genomics. 2021 Sep;113(5):2919-2924. doi: 10.1016/j.ygeno.2021.06.038. Epub 2021 Jun 27.
4. Fault diagnosis method of self-validating metal oxide semiconductor gas sensor based on t-distribution stochastic neighbor embedding and random forest.
   Rev Sci Instrum. 2019 May;90(5):055002. doi: 10.1063/1.5090142.
5. Prediction of antioxidant proteins using hybrid feature representation method and random forest.
   Genomics. 2020 Nov;112(6):4666-4674. doi: 10.1016/j.ygeno.2020.08.016. Epub 2020 Aug 17.
6. Predicting Inhibitors for Multidrug Resistance Associated Protein-2 Transporter by Machine Learning Approach.
   Comb Chem High Throughput Screen. 2018;21(8):557-566. doi: 10.2174/1386207321666181024104822.
7. Gene expression profiling of ATP-binding cassette (ABC) transporters as a predictor of the pathologic response to neoadjuvant chemotherapy in breast cancer patients.
   Breast Cancer Res Treat. 2006 Sep;99(1):9-17. doi: 10.1007/s10549-006-9175-2. Epub 2006 Jun 5.
8. The motor domains of ABC-transporters. What can structures tell us?
   Naunyn Schmiedebergs Arch Pharmacol. 2006 Mar;372(6):385-99. doi: 10.1007/s00210-005-0031-4. Epub 2006 Mar 16.
9. Insect ATP-Binding Cassette (ABC) Transporters: Roles in Xenobiotic Detoxification and Bt Insecticidal Activity.
   Int J Mol Sci. 2019 Jun 10;20(11):2829. doi: 10.3390/ijms20112829.
10. ATP-binding cassette transporters in reproduction: a new frontier.
   Hum Reprod Update. 2016 Mar-Apr;22(2):164-81. doi: 10.1093/humupd/dmv049. Epub 2015 Nov 5.

Cited by

1. Genomic structure of yellow lupin (Lupinus luteus): genome organization, evolution, gene family expansion, metabolites and protein synthesis.
   BMC Genomics. 2025 May 14;26(1):477. doi: 10.1186/s12864-025-11678-8.
2. Structural and biochemical insights of xylose MFS and SWEET transporters in microbial cell factories: challenges to lignocellulosic hydrolysates fermentation.
   Front Microbiol. 2024 Sep 27;15:1452240. doi: 10.3389/fmicb.2024.1452240. eCollection 2024.
3. Aptamers Targeting Membrane Proteins for Sensor and Diagnostic Applications.
   Molecules. 2023 Apr 26;28(9):3728. doi: 10.3390/molecules28093728.
4. A GHKNN model based on the physicochemical property extraction method to identify SNARE proteins.
   Front Genet. 2022 Nov 23;13:935717. doi: 10.3389/fgene.2022.935717. eCollection 2022.
5. Significance of a PTEN Mutational Status-Associated Gene Signature in the Progression and Prognosis of Endometrial Carcinoma.
   Oxid Med Cell Longev. 2022 Feb 23;2022:5130648. doi: 10.1155/2022/5130648. eCollection 2022.
6. Impact of Non-Coding RNAs on Chemotherapeutic Resistance in Oral Cancer.
   Biomolecules. 2022 Feb 9;12(2):284. doi: 10.3390/biom12020284.
7. Prediction of prokaryotic transposases from protein features with machine learning approaches.
   Microb Genom. 2021 Jul;7(7). doi: 10.1099/mgen.0.000611.
8. DeepRTCP: Predicting ATP-Binding Cassette Transporters Based on 1-Dimensional Convolutional Network.
   Front Cell Dev Biol. 2021 Feb 1;8:614080. doi: 10.3389/fcell.2020.614080. eCollection 2020.

References

1. RNAm5CPred: Prediction of RNA 5-Methylcytosine Sites Based on Three Different Kinds of Nucleotide Composition.
   Mol Ther Nucleic Acids. 2019 Dec 6;18:739-747. doi: 10.1016/j.omtn.2019.10.008. Epub 2019 Oct 18.
2. Genome-wide identification of ABC transporters in monogeneans.
   Mol Biochem Parasitol. 2019 Dec;234:111234. doi: 10.1016/j.molbiopara.2019.111234. Epub 2019 Nov 9.
3. DeepSVM-fold: protein fold recognition by combining support vector machines and pairwise sequence similarity scores generated by deep learning networks.
   Brief Bioinform. 2020 Sep 25;21(5):1733-1741. doi: 10.1093/bib/bbz098.
4. AOPs-SVM: A Sequence-Based Classifier of Antioxidant Proteins Using a Support Vector Machine.
   Front Bioeng Biotechnol. 2019 Sep 18;7:224. doi: 10.3389/fbioe.2019.00224. eCollection 2019.
5. Predicting disease-associated circular RNAs using deep forests combined with positive-unlabeled learning methods.
   Brief Bioinform. 2020 Jul 15;21(4):1425-1436. doi: 10.1093/bib/bbz080.
6. A comprehensive comparison and analysis of computational predictors for RNA N6-methyladenosine sites of Saccharomyces cerevisiae.
   Brief Funct Genomics. 2019 Nov 19;18(6):367-376. doi: 10.1093/bfgp/elz018.
7. Prediction of CYP450 Enzyme-Substrate Selectivity Based on the Network-Based Label Space Division Method.
   J Chem Inf Model. 2019 Nov 25;59(11):4577-4586. doi: 10.1021/acs.jcim.9b00749. Epub 2019 Oct 22.
8. Identifying enhancer-promoter interactions with neural network based on pre-trained DNA vectors and attention mechanism.
   Bioinformatics. 2020 Feb 15;36(4):1037-1043. doi: 10.1093/bioinformatics/btz694.
9. A Random Forest Sub-Golgi Protein Classifier Optimized via Dipeptide and Amino Acid Composition Features.
   Front Bioeng Biotechnol. 2019 Sep 4;7:215. doi: 10.3389/fbioe.2019.00215. eCollection 2019.
10. iPromoter-2L2.0: Identifying Promoters and Their Types by Combining Smoothing Cutting Window Algorithm and Sequence-Based Features.
   Mol Ther Nucleic Acids. 2019 Dec 6;18:80-87. doi: 10.1016/j.omtn.2019.08.008. Epub 2019 Aug 14.