Towards AI-designed genomes using a variational autoencoder.

Authors

Dudek Natasha K, Precup Doina

Affiliations

School of Computer Science, McGill University, Montreal, QC H3A 0G4, Canada.

Mila-Québec Artificial Intelligence Institute, Montreal, QC H2S 3H1, Canada.

Publication information

Proc Biol Sci. 2024 Dec;291(2036):20241457. doi: 10.1098/rspb.2024.1457. Epub 2024 Dec 11.

DOI:10.1098/rspb.2024.1457
PMID:39657811
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11631412/
Abstract

Genomes encode elaborate networks of genes whose products must seamlessly interact to support living organisms. Humans' capacity to understand these biological systems is limited by their sheer size and complexity. In this article, we develop a proof-of-concept framework for training a machine learning (ML) algorithm to model bacterial genome composition. To achieve this, we create simplified representations of genomes in the form of binary vectors that indicate the encoded genes, henceforth referred to as genome vectors. A denoising variational autoencoder was trained to accept corrupted genome vectors, in which most genes had been masked, and reconstruct the original. The resulting model, DeepGenomeVector, effectively captures complex dependencies in genomic networks, as evaluated by both qualitative and quantitative metrics. An in-depth functional analysis of a generated genome vector shows that its encoded pathways are interconnected, near complete, and ecologically cohesive. On the test set, where the model's ability to reconstruct uncorrupted genome vectors was evaluated, area under the receiver operating characteristic curve (AUROC) and F1 scores of 0.98 and 0.83, respectively, support the model's strong performance. This article showcases the power of ML approaches for synthetic biology and highlights the possibility that artificial intelligence agents may one day be able to design genomes that animate carbon-based cells.
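To make the data representation concrete, the following is a minimal sketch of a binary genome vector, a masking-style corruption, and the F1 score used to compare a reconstruction with the original. It is an illustration under stated assumptions and not the authors' implementation: the gene names, the `keep_fraction` parameter, and all function names are hypothetical, and the abstract does not specify the exact corruption scheme. The variational autoencoder itself is not shown.

```python
import random


def make_genome_vector(present_genes, vocabulary):
    """Encode a genome as a binary vector: 1 if the gene is encoded, else 0."""
    present = set(present_genes)
    return [1 if gene in present else 0 for gene in vocabulary]


def corrupt(genome_vector, keep_fraction, rng):
    """Mask most encoded genes: keep a random subset of the 1-bits, zero the rest.

    keep_fraction is a hypothetical knob; the abstract only says that
    "most genes had been masked".
    """
    on_positions = [i for i, bit in enumerate(genome_vector) if bit == 1]
    n_keep = max(1, round(keep_fraction * len(on_positions))) if on_positions else 0
    kept = set(rng.sample(on_positions, n_keep))
    return [1 if i in kept else 0 for i in range(len(genome_vector))]


def f1_score(original, reconstructed):
    """F1 over gene presence calls, treating 1 (gene present) as the positive class."""
    pairs = list(zip(original, reconstructed))
    tp = sum(1 for o, r in pairs if o == 1 and r == 1)
    fp = sum(1 for o, r in pairs if o == 0 and r == 1)
    fn = sum(1 for o, r in pairs if o == 1 and r == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical eight-gene vocabulary; real vocabularies would be far larger.
vocabulary = ["geneA", "geneB", "geneC", "geneD", "geneE", "geneF", "geneG", "geneH"]
genome = make_genome_vector(["geneA", "geneC", "geneD", "geneF", "geneG"], vocabulary)

rng = random.Random(0)
corrupted = corrupt(genome, keep_fraction=0.2, rng=rng)

# A model that simply echoed its corrupted input would recover few genes;
# a trained denoising model is expected to score much higher.
baseline_f1 = f1_score(genome, corrupted)
perfect_f1 = f1_score(genome, genome)
```

In this toy setting, `baseline_f1` measures how little of the genome survives corruption, which gives a floor against which a denoising model's reconstruction could be compared.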

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3133/11631412/659d9da22c21/rspb.2024.1457.f001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3133/11631412/5efc9494a4df/rspb.2024.1457.f002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3133/11631412/c5d1a1bc2f63/rspb.2024.1457.f003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3133/11631412/520c9bb04e30/rspb.2024.1457.f004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3133/11631412/de47333b9910/rspb.2024.1457.f005.jpg
