Creux Constance, Zehraoui Farida, Radvanyi François, Tahi Fariza
Université Paris-Saclay, Univ Evry, IBISC, Evry-Courcouronnes 91020, France.
Molecular Oncology, PSL Research University, CNRS, UMR 144, Institut Curie, Paris 75248, France.
Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf051.
As the biological roles and disease implications of non-coding RNAs continue to emerge, the need to thoroughly characterize previously unexplored non-coding RNAs becomes increasingly urgent. These molecules hold potential as biomarkers and therapeutic targets. However, the vast and complex nature of non-coding RNA data presents a challenge. We introduce MMnc, an interpretable deep-learning approach designed to classify non-coding RNAs into functional groups. MMnc leverages multiple data sources (such as sequence, secondary structure, and expression) using attention-based multi-modal data integration. This ensures the learning of meaningful representations while accounting for sources that are missing in some samples.
Our findings demonstrate that MMnc achieves high classification accuracy across diverse non-coding RNA classes. The method's modular architecture can incorporate multiple types of modalities, whereas other tools consider two at most. MMnc is resilient to missing data, ensuring that all available information is effectively utilized. Importantly, the generated attention scores offer interpretable insights into the underlying patterns of the different non-coding RNA classes, potentially driving future non-coding RNA research and applications.
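To make the idea of attention-based integration with missing sources concrete, the following is a minimal sketch and not the MMnc implementation. It assumes each modality has already been encoded into a fixed-length vector and that an unnormalized relevance score per modality is available (in a trained model such scores would come from a learned layer). The function name `masked_attention_fusion`, the modality names, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def masked_attention_fusion(embeddings, scores):
    """Fuse per-modality vectors with a softmax restricted to present modalities.

    embeddings: dict mapping modality name -> list of floats, or None if missing.
    scores: dict mapping modality name -> unnormalized relevance score
            (assumption: supplied by hand here, learned in a real model).
    Returns (fused_vector, attention_weights).
    """
    present = [m for m, vec in embeddings.items() if vec is not None]
    if not present:
        raise ValueError("at least one modality must be available")

    # Numerically stable softmax over the present modalities only.
    top = max(scores[m] for m in present)
    exps = {m: math.exp(scores[m] - top) for m in present}
    total = sum(exps.values())

    # Missing modalities receive a weight of exactly zero.
    weights = {m: (exps[m] / total if m in exps else 0.0) for m in embeddings}

    # Weighted sum of the available embeddings.
    dim = len(embeddings[present[0]])
    fused = [sum(weights[m] * embeddings[m][i] for m in present) for i in range(dim)]
    return fused, weights


# Hypothetical sample in which the expression modality is missing.
sample = {"sequence": [1.0, 0.0], "structure": [0.0, 1.0], "expression": None}
raw_scores = {"sequence": 0.0, "structure": 0.0, "expression": 5.0}
fused, weights = masked_attention_fusion(sample, raw_scores)
```

In this sketch the missing modality is excluded before normalization, so its score cannot influence the result, and the remaining weights still sum to one. The returned weights can be read directly as the relative contribution of each available source, which is the sense in which attention scores lend themselves to interpretation.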
Data and source code can be found at EvryRNA.ibisc.univ-evry.fr/EvryRNA/MMnc.