Data-driven AI system for learning how to run transcript assemblers.

Authors

Shen Yihang, Yan Zhiwen, Kingsford Carl

Affiliation

Ray and Stephanie Lane Computational Biology Department, School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA.

Publication

bioRxiv. 2024 Oct 30:2024.01.25.577290. doi: 10.1101/2024.01.25.577290.

DOI:10.1101/2024.01.25.577290

PMID:39554123

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC11565938/

Abstract

We introduce AutoTuneX, a data-driven AI system designed to automatically predict optimal parameters for transcript assemblers, tools that reconstruct expressed transcripts from the reads in a given RNA-seq sample. AutoTuneX is built by learning parameter knowledge from existing RNA-seq samples and transferring this knowledge to unseen samples. On 1588 human RNA-seq samples tested with two transcript assemblers, the parameters predicted by AutoTuneX yield more accurate transcript assembly than the default parameter settings for 98% of samples, with some samples showing up to a 600% improvement in AUC. AutoTuneX offers a new strategy for automatically optimizing the use of sequence analysis tools.
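To make the scale of the reported gain concrete, the following is a minimal sketch of how a percent improvement in AUC over the default-parameter run could be computed. The function name and the AUC values are hypothetical illustrations and are not taken from the paper; the abstract does not state the exact formula, so the usual relative-change definition is assumed.

```python
def relative_improvement(default_auc: float, tuned_auc: float) -> float:
    """Percent change in AUC relative to the default-parameter run.

    Assumes the usual relative-change definition:
    (tuned - default) / default * 100.
    """
    if default_auc <= 0:
        raise ValueError("default_auc must be positive")
    return (tuned_auc - default_auc) / default_auc * 100.0


# Hypothetical AUC values chosen for illustration only.
default_auc = 0.05
tuned_auc = 0.35
print(f"{relative_improvement(default_auc, tuned_auc):.0f}% improvement")
```

Under this definition, a 600% improvement means the tuned AUC is seven times the default AUC, not six times.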
