Bich Goran, Monsellier Elodie, Travé Gilles, Nominé Yves
Equipe Labellisée Ligue 2015, Department of Integrated Structural Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire (IGBMC), CNRS UMR 7104, INSERM U1258, Université de Strasbourg, Illkirch 67404, France.
Bioinform Adv. 2023 Mar 9;3(1):vbad022. doi: 10.1093/bioadv/vbad022. eCollection 2023.
Studying sets of proteins is central to biology. In particular, the application of omics over the last decades has generated lists of several hundred or several thousand proteins or genes. However, these lists are often not inspected globally, possibly because few tools can simultaneously visualize the feature architectures of a large number of proteins.
Here, we present ProFeatMap, an intuitive Python-based website. For a given set of proteins, it displays features such as domains, repeats, disorder or post-translational modifications, and their organization along the sequences, in a highly customizable 2D map. Starting from a user-defined list of UniProt accession codes, ProFeatMap extracts the most important annotated features available for each protein from a well-established database such as UniProt or InterPro, allocates shapes and colors, optionally according to quantitative or qualitative data, and sorts the protein list based on homologous feature content. The resulting publication-quality map allows even large protein families to be explored and classified based on shared features. It can help reveal insights, such as feature redundancy or feature patterns, that were previously overlooked. ProFeatMap is freely available on the web at: https://profeatmap.pythonanywhere.com/.
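The two processing steps described above, allocating colors to features and sorting proteins by shared feature content, can be illustrated with a minimal Python sketch. This is not ProFeatMap's actual code: the accession codes, feature coordinates, palette, function names, and the choice of Jaccard similarity with a greedy nearest-neighbor ordering are all assumptions made for illustration only.

```python
from itertools import cycle

# Hypothetical toy input (not real UniProt entries):
# accession -> list of (feature_name, start, end) along the sequence.
PROTEINS = {
    "P00001": [("PDZ", 10, 95), ("SH3", 120, 180)],
    "P00002": [("Kinase", 5, 260)],
    "P00003": [("PDZ", 30, 115), ("PDZ", 140, 225), ("SH3", 250, 310)],
    "P00004": [("Kinase", 12, 270), ("SH3", 300, 360)],
}

# Assumed palette; any list of color names would do.
PALETTE = ["tab:blue", "tab:orange", "tab:green", "tab:red"]


def assign_colors(proteins, palette=PALETTE):
    """Map each distinct feature name to a color, cycling through the palette."""
    names = sorted({name for feats in proteins.values() for name, _, _ in feats})
    return dict(zip(names, cycle(palette)))


def jaccard(a, b):
    """Jaccard similarity between two sets of feature names."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def sort_by_feature_content(proteins):
    """Greedy ordering: start from the first accession (alphabetically), then
    repeatedly append the remaining protein most similar to the last one placed.
    Ties are broken by accession order."""
    feature_sets = {
        acc: {name for name, _, _ in feats} for acc, feats in proteins.items()
    }
    remaining = sorted(feature_sets)
    if not remaining:
        return []
    order = [remaining.pop(0)]
    while remaining:
        last = feature_sets[order[-1]]
        # max() returns the first maximal element, so ties keep accession order.
        nearest = max(remaining, key=lambda acc: jaccard(last, feature_sets[acc]))
        remaining.remove(nearest)
        order.append(nearest)
    return order


if __name__ == "__main__":
    colors = assign_colors(PROTEINS)
    for acc in sort_by_feature_content(PROTEINS):
        row = ", ".join(
            f"{name}[{start}-{end}]:{colors[name]}"
            for name, start, end in PROTEINS[acc]
        )
        print(f"{acc}  {row}")
```

In this toy example, proteins sharing the PDZ and SH3 features end up on adjacent rows, which is the kind of grouping that makes shared architectures visible in a map. A real tool would likely use a more robust clustering method and would also account for feature order and copy number.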
Source code is freely accessible at https://github.com/profeatmap/ProFeatMap under the GPL license.
Supplementary data are available at Bioinformatics Advances online.