Examining the Influenza A Virus Sialic Acid Binding Preference Predictions of a Sequence-Based Convolutional Neural Network.

Authors

Borkenhagen Laura K, Runstadler Jonathan A

Affiliation

Department of Infectious Disease and Global Health, Cummings School of Veterinary Medicine, Tufts University, North Grafton, Massachusetts, USA.

Publication

Influenza Other Respir Viruses. 2024 Dec;18(12):e70044. doi: 10.1111/irv.70044.

Abstract

BACKGROUND

Though receptor binding specificity is well established as a contributor to host tropism and spillover potential of influenza A viruses, determining receptor binding preference of a specific virus still requires expensive and time-consuming laboratory analyses. In this study, we pilot a machine learning approach for prediction of binding preference.

METHODS

We trained a convolutional neural network to predict the α2,6-linked sialic acid preference of influenza A viruses given the hemagglutinin amino acid sequence. The model was evaluated with an independent test dataset to assess the standard performance metrics, the impact of missing data in the test sequences, and the prediction performance on novel subtypes. Further, features found to be important to the generation of predictions were tested via targeted mutagenesis of H9 and H16 proteins expressed on pseudoviruses.
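As an illustration only, the sketch below shows one common way a hemagglutinin amino acid sequence could be turned into numeric input for a convolutional network, and how missing residues could be simulated in a test sequence. The abstract does not describe the authors' encoding, so the one-hot scheme, the use of `X` for missing residues, and the helper names `one_hot` and `mask_residues` are all assumptions made for this example.

```python
import random

# Assumption: the 20 standard amino acids, one channel each.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(seq, length):
    """Encode a sequence as a length x 20 matrix of 0/1 values.

    Unknown residues (for example 'X' or '-') and padding positions
    beyond the end of the sequence become all-zero rows.
    """
    rows = []
    for pos in range(length):
        row = [0] * len(AMINO_ACIDS)
        if pos < len(seq) and seq[pos] in INDEX:
            row[INDEX[seq[pos]]] = 1
        rows.append(row)
    return rows


def mask_residues(seq, fraction, seed=0):
    """Replace a given fraction of positions with 'X' to simulate missing data."""
    rng = random.Random(seed)
    n_missing = round(len(seq) * fraction)
    missing = set(rng.sample(range(len(seq)), n_missing))
    return "".join("X" if i in missing else aa for i, aa in enumerate(seq))
```

Encoding a masked copy of each test sequence at increasing `fraction` values is one simple way to probe how much missing data a trained model can tolerate.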

RESULTS

The final model developed in this study made correct predictions for 94% of the test dataset and achieved an area under the receiver operating characteristic curve of 0.93. The model tolerated about 10% missing data in the test sequences without a loss of prediction accuracy. Predictions on novel subtypes revealed that the model can extrapolate feature relationships between subtypes when generating binding predictions. Finally, evaluation of the features important for model predictions helped identify positions that alter the sialic acid conformation preference of hemagglutinin proteins in practice.
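For readers less familiar with the two headline metrics, the following sketch shows how accuracy and the area under the receiver operating characteristic curve can be computed from binary labels and predicted scores. It is a generic illustration, not the authors' evaluation code: the function names, the 0.5 decision threshold, and the pairwise (Mann-Whitney) formulation of the area under the curve are choices made for this example.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of samples whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))
```

Accuracy depends on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds, which is why the two values need not coincide.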

CONCLUSIONS

Ultimately, our results support this in silico approach to predicting hemagglutinin receptor binding preference. This work emphasizes the need for ongoing research efforts to produce tools that may aid future pandemic risk assessment.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/db50/11634464/5fa87c31f851/IRV-18-e70044-g006.jpg
