ABEILLE: a novel method for ABerrant Expression Identification empLoying machine LEarning from RNA-sequencing data.

Affiliations

Université Côte d'Azur, Center of Modeling, Simulation and Interactions, Nice 06000, France.

Université Côte d'Azur, Inserm U1081, CNRS UMR 7284, Institute for Research on Cancer and Aging, Nice (IRCAN), Centre Hospitalier Universitaire (CHU) de Nice, Nice 06200, France.

Publication information

Bioinformatics. 2022 Oct 14;38(20):4754-4761. doi: 10.1093/bioinformatics/btac603.

DOI:10.1093/bioinformatics/btac603
PMID:36063052
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9563686/
Abstract

MOTIVATION

Current advances in omics technologies are paving the way for the diagnosis of rare diseases by offering a complementary assay to identify the responsible gene. The use of transcriptomic data to identify aberrant gene expression (AGE) has been shown to reveal potentially pathogenic events. However, popular approaches for AGE identification are limited by their reliance on statistical tests, which require choosing an arbitrary cut-off for significance assessment and several replicates that are not always available in clinical contexts.

RESULTS

Hence, we developed ABerrant Expression Identification empLoying machine LEarning from sequencing data (ABEILLE), a variational autoencoder (VAE)-based method that identifies AGEs from RNA-seq data without the need for replicates or a control group. ABEILLE combines a VAE, which can model any data without specific assumptions about their distribution, with a decision tree that classifies genes as AGE or non-AGE. Each gene is assigned an anomaly score so that AGEs can be stratified by the severity of aberration. We tested ABEILLE on a semi-synthetic and an experimental dataset, demonstrating the importance of a flexible VAE configuration for identifying potential pathogenic candidates.

AVAILABILITY AND IMPLEMENTATION

ABEILLE source code is freely available at: https://github.com/UCA-MSI/ABEILLE.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
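
The abstract describes scoring genes by how far their observed expression departs from what a learned model reconstructs. The sketch below illustrates only that reconstruction-error idea and is not the ABEILLE implementation: a rank-1 linear model fitted by power iteration stands in for the VAE, the absolute residual stands in for the anomaly score, and the decision-tree classification step is omitted. All function names, the toy data, and the planted outlier are assumptions made for illustration.

```python
import math
from statistics import median

def center_rows(matrix):
    """Subtract each gene's median so genes are on a comparable scale."""
    return [[x - median(row) for x in row] for row in matrix]

def rank1_fit(matrix, iterations=100):
    """Leading singular pair via power iteration, so matrix ~ outer(u, v).

    Stand-in for the VAE: it learns the dominant shared pattern across
    samples and reconstructs each gene from it.
    """
    n_genes, n_samples = len(matrix), len(matrix[0])
    v = [math.sin(j + 1.0) for j in range(n_samples)]  # non-constant start
    for _ in range(iterations):
        u = [sum(m * w for m, w in zip(row, v)) for row in matrix]
        v = [sum(matrix[i][j] * u[i] for i in range(n_genes))
             for j in range(n_samples)]
        norm = math.sqrt(sum(w * w for w in v))
        v = [w / norm for w in v]
    u = [sum(m * w for m, w in zip(row, v)) for row in matrix]
    return u, v

def anomaly_scores(log_expr):
    """Absolute reconstruction residual per (gene, sample) cell."""
    centered = center_rows(log_expr)
    u, v = rank1_fit(centered)
    return [[abs(centered[i][j] - u[i] * v[j]) for j in range(len(v))]
            for i in range(len(u))]

# Toy log-expression matrix (genes x samples): gene baseline + sample
# effect + small deterministic noise. Values are invented, not real data.
sample_effect = [-1.0, -0.5, 0.0, 0.5, 1.0, -0.75, 0.75, 0.25]
log_expr = [[5.0 + 0.3 * g + b + 0.05 * math.sin(7 * g + 3 * s)
             for s, b in enumerate(sample_effect)]
            for g in range(12)]
log_expr[4][2] += 3.0  # planted aberrant expression: gene 4 in sample 2

scores = anomaly_scores(log_expr)
top_score, top_gene, top_sample = max(
    (scores[g][s], g, s)
    for g in range(len(scores)) for s in range(len(scores[0]))
)
print(f"highest score: gene {top_gene}, sample {top_sample}")
```

Ranking cells by residual needs no replicates or control group, which is the property the abstract emphasizes. A real VAE would replace `rank1_fit` with a nonlinear encoder and decoder.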

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f2a/9563686/124f0c339766/btac603f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f2a/9563686/26ae68857685/btac603f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f2a/9563686/18c89f27f350/btac603f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2f2a/9563686/401afe414233/btac603f3.jpg