BioAutoMATED: An end-to-end automated machine learning tool for explanation and design of biological sequences.

Affiliations

Department of Biological Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA.

Institute for Medical Engineering and Science, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA; Wyss Institute for Biologically Inspired Engineering, Harvard University, Boston, MA 02115, USA; Department of Mechanical Engineering, Massachusetts Institute of Technology, 77 Massachusetts Ave, Cambridge, MA 02139, USA.

Publication information

Cell Syst. 2023 Jun 21;14(6):525-542.e9. doi: 10.1016/j.cels.2023.05.007.

Abstract

The design choices underlying machine-learning (ML) models present important barriers to entry for many biologists who aim to incorporate ML in their research. Automated machine-learning (AutoML) algorithms can address many challenges that come with applying ML to the life sciences. However, these algorithms are rarely used in systems and synthetic biology studies because they typically do not explicitly handle biological sequences (e.g., nucleotide, amino acid, or glycan sequences) and cannot be easily compared with other AutoML algorithms. Here, we present BioAutoMATED, an AutoML platform for biological sequence analysis that integrates multiple AutoML methods into a unified framework. Users are automatically provided with relevant techniques for analyzing, interpreting, and designing biological sequences. BioAutoMATED predicts gene regulation, peptide-drug interactions, and glycan annotation, and designs optimized synthetic biology components, revealing salient sequence characteristics. By automating sequence modeling, BioAutoMATED allows life scientists to incorporate ML more readily into their work.
