LinearAlifold: Linear-time consensus structure prediction for RNA alignments.

Affiliations

School of EECS, Oregon State University, Corvallis, OR 97330, USA.

Dept. of Biochemistry & Biophysics, University of Rochester Medical Center, Rochester, NY 14642, USA; Center for RNA Biology, University of Rochester Medical Center, Rochester, NY 14642, USA; Dept. of Biostatistics & Computational Biology, University of Rochester Medical Center, Rochester, NY 14642, USA.

Publication information

J Mol Biol. 2024 Sep 1;436(17):168694. doi: 10.1016/j.jmb.2024.168694. Epub 2024 Jul 4.

DOI:10.1016/j.jmb.2024.168694

PMID:38971557

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11377157/

Abstract

Predicting the consensus structure of a set of aligned RNA homologs is a convenient method to find conserved structures in an RNA genome, which has many applications including viral diagnostics and therapeutics. However, the most commonly used tool for this task, RNAalifold, is prohibitively slow for long sequences because its runtime scales cubically with sequence length: it takes over a day on 400 SARS-CoV-2 and SARS-related genomes (~30,000 nt). We present LinearAlifold, a much faster alternative that scales linearly with both the sequence length and the number of sequences, based on our work LinearFold, which folds a single RNA in linear time. Our work is orders of magnitude faster than RNAalifold (0.7 h on the above 400 genomes, or ~36× speedup) and achieves higher accuracy when evaluated against a database of known structures. More interestingly, LinearAlifold's prediction on SARS-CoV-2 correlates well with experimentally determined structures, substantially outperforming RNAalifold. Finally, LinearAlifold supports two energy models (Vienna and BL*) and four modes: minimum free energy (MFE), maximum expected accuracy (MEA), ThreshKnot, and stochastic sampling, each of which takes under an hour for hundreds of SARS-CoV variants. Our resource is at: https://github.com/LinearFold/LinearAlifold (code) and http://linearfold.org/linear-alifold (server).
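The scaling claims in the abstract can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function and constant names are hypothetical, constant factors are ignored, and nothing here is a benchmark of either tool. It shows that a ~36× speedup over a 0.7 h run implies roughly 25 h for RNAalifold (consistent with "over a day"), and that doubling the sequence length multiplies a cubic-time runtime by 8 but a linear-time runtime by only 2.

```python
def scaled_runtime(base_hours: float, length_factor: float, exponent: int) -> float:
    """Runtime after multiplying sequence length by `length_factor`,
    assuming pure O(n^exponent) scaling (constant factors ignored)."""
    return base_hours * length_factor ** exponent


# Figures quoted in the abstract.
LINEAR_ALIFOLD_HOURS = 0.7  # LinearAlifold on the 400 genomes
REPORTED_SPEEDUP = 36       # approximate speedup over RNAalifold

# RNAalifold runtime implied by the two figures above.
implied_rnaalifold_hours = LINEAR_ALIFOLD_HOURS * REPORTED_SPEEDUP

# Effect of doubling sequence length under each scaling law
# (hypothetical 1-hour baseline, for illustration only).
cubic_after_doubling = scaled_runtime(1.0, 2.0, 3)
linear_after_doubling = scaled_runtime(1.0, 2.0, 1)

if __name__ == "__main__":
    print(f"Implied RNAalifold runtime: {implied_rnaalifold_hours:.1f} h")
    print(f"Cubic scaling, 2x length:  {cubic_after_doubling:.0f}x slower")
    print(f"Linear scaling, 2x length: {linear_after_doubling:.0f}x slower")
```

Because the gap between cubic and linear growth widens with length, the advantage of a linear-time method is largest on genome-scale inputs such as the ~30,000 nt alignments mentioned above.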

Cited by

ECSFinder: optimized prediction of evolutionarily conserved RNA secondary structures from genome sequences.

Nucleic Acids Res. 2025 Aug 11;53(15). doi: 10.1093/nar/gkaf780.

AlignmentFold and AlignmentPartition: Improving the align-then-fold approach for RNA secondary structure prediction.

bioRxiv. 2025 Jul 28:2025.07.23.666478. doi: 10.1101/2025.07.23.666478.

LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods.

Bioinform Adv. 2022 Oct 22;2(1):vbac078. doi: 10.1093/bioadv/vbac078. eCollection 2022.

References

LinearCoFold and LinearCoPartition: linear-time algorithms for secondary structure prediction of interacting RNA molecules.

Nucleic Acids Res. 2023 Oct 13;51(18):e94. doi: 10.1093/nar/gkad664.

LinAliFold and CentroidLinAliFold: fast RNA consensus secondary structure prediction for aligned sequences using beam search methods.

Bioinform Adv. 2022 Oct 22;2(1):vbac078. doi: 10.1093/bioadv/vbac078. eCollection 2022.

LazySampling and LinearSampling: fast stochastic sampling of RNA secondary structure with applications to SARS-CoV-2.

Nucleic Acids Res. 2023 Jan 25;51(2):e7. doi: 10.1093/nar/gkac1029.

LinearTurboFold: Linear-time global prediction of conserved structures for RNA homologs with applications to SARS-CoV-2.

Proc Natl Acad Sci U S A. 2021 Dec 28;118(52). doi: 10.1073/pnas.2116269118.

In vivo structural characterization of the SARS-CoV-2 RNA genome identifies host proteins vulnerable to repurposed drugs.

Cell. 2021 Apr 1;184(7):1865-1883.e20. doi: 10.1016/j.cell.2021.02.008. Epub 2021 Feb 9.

Comprehensive in vivo secondary structure of the SARS-CoV-2 genome reveals novel regulatory motifs and mechanisms.

Mol Cell. 2021 Feb 4;81(3):584-598.e5. doi: 10.1016/j.molcel.2020.12.041. Epub 2021 Jan 1.

The Short- and Long-Range RNA-RNA Interactome of SARS-CoV-2.

Mol Cell. 2020 Dec 17;80(6):1067-1077.e5. doi: 10.1016/j.molcel.2020.11.004. Epub 2020 Nov 5.

LinearPartition: linear-time approximation of RNA folding partition function and base-pairing probabilities.

Bioinformatics. 2020 Jul 1;36(Suppl_1):i258-i267. doi: 10.1093/bioinformatics/btaa460.

Data, disease and diplomacy: GISAID's innovative contribution to global health.

Glob Chall. 2017 Jan 10;1(1):33-46. doi: 10.1002/gch2.1018. eCollection 2017 Jan.

LinearFold: linear-time approximate RNA folding by 5'-to-3' dynamic programming and beam search.

Bioinformatics. 2019 Jul 15;35(14):i295-i304. doi: 10.1093/bioinformatics/btz375.
