IEEE Comput Graph Appl. 2024 May-Jun;44(3):114-125. doi: 10.1109/MCG.2023.3345742. Epub 2024 Jun 21.
This article presents idMotif, a visual analytics framework that supports domain experts in identifying motifs in protein sequences. A motif is a short sequence of amino acids usually associated with a distinct function of a protein, and identifying similar motifs in protein sequences helps predict certain types of disease or infection. idMotif can be used to explore, analyze, and visualize such motifs in protein sequences. We introduce a deep-learning-based method for grouping protein sequences and allow users to discover motif candidates for protein groups based on local explanations of the deep-learning model's decisions. idMotif provides several interactive linked views for between- and within-cluster/group analysis and for sequence analysis. Through a case study and expert feedback, we demonstrate how the framework helps domain experts analyze protein sequences and identify motifs.