TACOS: a novel approach for accurate prediction of cell-specific long noncoding RNAs subcellular localization.

Affiliations

Department of Integrative Biotechnology, College of Bioengineering and Biotechnology, Sungkyunkwan University, Suwon 16419, Korea.

Tulane Center for Biomedical Informatics and Genomics, Division of Biomedical Informatics and Genomics, John W. Deming Department of Medicine, School of Medicine, Tulane University, New Orleans, LA 70112, USA.

Publication information

Brief Bioinform. 2022 Jul 18;23(4). doi: 10.1093/bib/bbac243.

Abstract

Long noncoding RNAs (lncRNAs) are primarily regulated by their cellular localization, which is responsible for their molecular functions, including cell cycle regulation and genome rearrangements. Accurately identifying the subcellular location of lncRNAs from sequence information is crucial for a better understanding of their biological functions and mechanisms. In contrast to traditional experimental methods, bioinformatics or computational methods can be applied for the annotation of lncRNA subcellular locations in humans more effectively. In the past, several machine learning-based methods have been developed to identify lncRNA subcellular localization, but relevant work for identifying cell-specific localization of human lncRNA remains limited. In this study, we present the first application of the tree-based stacking approach, TACOS, which allows users to identify the subcellular localization of human lncRNA in 10 different cell types. Specifically, we conducted comprehensive evaluations of six tree-based classifiers with 10 different feature descriptors, using a newly constructed balanced training dataset for each cell type. Subsequently, the strengths of the AdaBoost baseline models were integrated via a stacking approach, with an appropriate tree-based classifier for the final prediction. TACOS displayed consistent performance in both the cross-validation and independent assessments compared with the other two approaches employed in this study. The user-friendly online TACOS web server can be accessed at https://balalab-skku.org/TACOS.
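
The stacking idea described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two descriptor matrices are synthetic stand-ins for sequence-derived feature encodings, the names `desc_a` and `desc_b` are hypothetical, and the choice of a random forest as the final tree-based classifier is an assumption, since the abstract does not name it. It uses scikit-learn.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict

rng = np.random.default_rng(0)
n_samples = 200

# Balanced binary labels (two hypothetical localization classes).
y = np.repeat([0, 1], n_samples // 2)

# Synthetic placeholders for two feature descriptors computed from sequences.
# Real descriptors would be derived from the lncRNA sequences themselves.
descriptors = {
    "desc_a": rng.normal(size=(n_samples, 8)) + 0.8 * y[:, None],
    "desc_b": rng.normal(size=(n_samples, 5)) + 0.5 * y[:, None],
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Level 0: one AdaBoost baseline model per descriptor. Out-of-fold predicted
# probabilities are collected so the meta-classifier never sees a prediction
# made by a model that was trained on the same sample.
meta_features = np.column_stack([
    cross_val_predict(
        AdaBoostClassifier(n_estimators=50, random_state=0),
        X, y, cv=cv, method="predict_proba",
    )[:, 1]
    for X in descriptors.values()
])

# Level 1: a tree-based meta-classifier trained on the baseline probabilities
# (random forest chosen here purely for illustration).
meta_model = RandomForestClassifier(n_estimators=100, random_state=0)
meta_model.fit(meta_features, y)

print(meta_features.shape)  # one column per baseline model
```

To score new sequences under this sketch, each baseline model would be refit on the full training set, and its predicted probabilities for the new sequences would be passed to `meta_model`.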

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/77ca/9294414/9645c05c2bfd/bbac243f1.jpg
