Benchmarking and Testing Machine Learning Approaches with BARRA:CuRDa, a Curated RNA-Seq Database for Cancer Research.

Authors

Feltes Bruno César, Poloni Joice De Faria, Dorn Márcio

Affiliations

Institute of Informatics, Department of Theoretical Computer Science, Federal University of Rio Grande do Sul, Porto Alegre, Brazil.

Institute of Biosciences, Department of Biophysics, Federal University of Rio Grande do Sul, Porto Alegre, Brazil.

Publication information

J Comput Biol. 2021 Sep;28(9):931-944. doi: 10.1089/cmb.2020.0463. Epub 2021 Jul 14.

DOI:10.1089/cmb.2020.0463
PMID:34264745
Abstract

RNA-seq is gradually becoming the dominant technique for assessing global gene expression in biological samples, allowing more flexible protocols and robust analysis. However, the nature of RNA-seq results imposes new data-handling challenges when it comes to computational analysis. With the increasing use of machine learning (ML) techniques in the biomedical sciences, databases that provide curated data sets, treated with state-of-the-art approaches and already adapted to ML protocols, become essential for testing new algorithms. In this study, we present the Benchmarking of ARtificial intelligence Research: Curated RNA-seq Database (BARRA:CuRDa). BARRA:CuRDa was built exclusively for cancer research and is composed of 17 handpicked RNA-seq data sets for Homo sapiens that were gathered from the Gene Expression Omnibus using rigorous filtering criteria. All data sets were individually submitted to sample quality analysis, removal of low-quality bases and artifacts from the experimental process, removal of ribosomal RNA, and estimation of transcript-level abundance. Moreover, all data sets were tested using standard approaches in the field, which allows them to be used as benchmarks for new ML approaches. A feature selection analysis was also performed on each data set to investigate the biological accuracy of basic techniques. Results include genes already related to their specific tumoral tissue, as well as a large number of long noncoding RNAs and pseudogenes. BARRA:CuRDa is available at http://sbcb.inf.ufrgs.br/barracurda.
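The abstract mentions a feature selection analysis on each data set but does not name the techniques used. As a rough illustration only, the sketch below ranks genes by the absolute value of Welch's t-statistic between tumor and normal samples. The choice of statistic, the function names (`welch_t`, `rank_genes`), the gene names, and the synthetic expression values are all assumptions made for this example and are not taken from BARRA:CuRDa.

```python
import random
import statistics


def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    var_a = statistics.variance(a)
    var_b = statistics.variance(b)
    standard_error = (var_a / len(a) + var_b / len(b)) ** 0.5
    if standard_error == 0:
        return 0.0
    return (statistics.mean(a) - statistics.mean(b)) / standard_error


def rank_genes(expression, labels, top_k=5):
    """Return the top_k genes with the largest |t| between the two classes.

    expression: dict mapping gene name -> list of values, one per sample
    labels: list of "tumor" / "normal" strings, aligned with the samples
    """
    scored = []
    for gene, values in expression.items():
        tumor = [v for v, y in zip(values, labels) if y == "tumor"]
        normal = [v for v, y in zip(values, labels) if y == "normal"]
        scored.append((abs(welch_t(tumor, normal)), gene))
    scored.sort(reverse=True)
    return [gene for _, gene in scored[:top_k]]


# Synthetic data (hypothetical): 10 tumor and 10 normal samples, 50 null genes.
rng = random.Random(0)
labels = ["tumor"] * 10 + ["normal"] * 10
expression = {
    f"gene_{i}": [rng.gauss(5.0, 1.0) for _ in labels] for i in range(50)
}
# One planted gene that is strongly over-expressed in the tumor samples.
expression["gene_planted"] = (
    [rng.gauss(12.0, 1.0) for _ in range(10)]
    + [rng.gauss(5.0, 1.0) for _ in range(10)]
)

top_genes = rank_genes(expression, labels, top_k=5)
print(top_genes)
```

A univariate filter like this is only one of many basic selection strategies; on real transcript-level abundance data one would typically normalize or log-transform the values first and correct for multiple testing before interpreting the ranking.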

Similar articles

1
Benchmarking and Testing Machine Learning Approaches with BARRA:CuRDa, a Curated RNA-Seq Database for Cancer Research.
J Comput Biol. 2021 Sep;28(9):931-944. doi: 10.1089/cmb.2020.0463. Epub 2021 Jul 14.
2
CuMiDa: An Extensively Curated Microarray Database for Benchmarking and Testing of Machine Learning Approaches in Cancer Research.
J Comput Biol. 2019 Apr;26(4):376-386. doi: 10.1089/cmb.2018.0238. Epub 2019 Feb 21.
3
PanClassif: Improving pan cancer classification of single cell RNA-seq gene expression data using machine learning.
Genomics. 2022 Mar;114(2):110264. doi: 10.1016/j.ygeno.2022.01.001. Epub 2022 Jan 6.
4
RNA-seq assistant: machine learning based methods to identify more transcriptional regulated genes.
BMC Genomics. 2018 Jul 20;19(1):546. doi: 10.1186/s12864-018-4932-2.
5
How does normalization impact RNA-seq disease diagnosis?
J Biomed Inform. 2018 Sep;85:80-92. doi: 10.1016/j.jbi.2018.07.016. Epub 2018 Jul 21.
6
Assessment of Gene Set Enrichment Analysis using curated RNA-seq-based benchmarks.
PLoS One. 2024 May 16;19(5):e0302696. doi: 10.1371/journal.pone.0302696. eCollection 2024.
7
A scoping review on deep learning for next-generation RNA-Seq. data analysis.
Funct Integr Genomics. 2023 Apr 21;23(2):134. doi: 10.1007/s10142-023-01064-6.
8
Biological classification with RNA-seq data: Can alternatively spliced transcript expression enhance machine learning classifiers?
RNA. 2018 Sep;24(9):1119-1132. doi: 10.1261/rna.062802.117. Epub 2018 Jun 25.
9
GeneSelectML: a comprehensive way of gene selection for RNA-Seq data via machine learning algorithms.
Med Biol Eng Comput. 2023 Jan;61(1):229-241. doi: 10.1007/s11517-022-02695-w. Epub 2022 Nov 10.
10
RNA-Seq Atlas--a reference database for gene expression profiling in normal tissue by next-generation sequencing.
Bioinformatics. 2012 Apr 15;28(8):1184-5. doi: 10.1093/bioinformatics/bts084. Epub 2012 Feb 17.

Cited by

1
Charting the transcriptomic landscape of primary and metastatic cancers in relation to their origin and target normal tissues.
Sci Adv. 2024 Dec 6;10(49):eadn0220. doi: 10.1126/sciadv.adn0220.
2
Editorial: Computational and integrative approaches for developmental biology and molecular evolution.
Front Genet. 2023 Jul 14;14:1252328. doi: 10.3389/fgene.2023.1252328. eCollection 2023.
3
Synthetic lethal gene pairs: Experimental approaches and predictive models.
Front Genet. 2022 Dec 1;13:961611. doi: 10.3389/fgene.2022.961611. eCollection 2022.
4
Comparison of machine learning techniques to handle imbalanced COVID-19 CBC datasets.
PeerJ Comput Sci. 2021 Aug 12;7:e670. doi: 10.7717/peerj-cs.670. eCollection 2021.