Institut de Recherche en Biologie Végétale, Département de Sciences Biologiques, Université de Montréal, QC H1X 2B2, Canada.
Department of Epidemiology, Biostatistics and Occupational Health, McGill University, Montréal, QC H3A 1A2, Canada.
Genomics Proteomics Bioinformatics. 2020 Oct;18(5):613-623. doi: 10.1016/j.gpb.2018.07.011. Epub 2020 Dec 18.
In this study, we introduce a novel bioinformatics program, Spore-associated Symbiotic Microbes Position-specific Function (SeSaMe PS Function), for position-specific functional analysis of short sequences derived from metagenome sequencing data of arbuscular mycorrhizal fungi. The unique advantage of the program lies in its databases, which are built on genus-specific sequence properties derived from protein secondary structure: amino acid usage, codon usage, and the codon contexts of 3-codon DNA 9-mers. SeSaMe PS Function searches a query sequence against a reference sequence database, identifies 3-codon DNA 9-mers with structural roles, and creates a comparative dataset containing the codon usage biases of those 3-codon DNA 9-mers across 54 bacterial and fungal genera. The program then applies correlation principal component analysis together with the K-means clustering method to the comparative dataset. The 3-codon DNA 9-mers that cluster alone or with only a few other members are often structurally and functionally distinctive sites, and they provide useful insights into important molecular interactions. The program provides a versatile means of studying the functions of short sequences from metagenome sequencing and has a wide spectrum of applications. SeSaMe PS Function is freely accessible at www.fungalsesame.org.
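To make the notion of a 3-codon DNA 9-mer and of a comparative dataset concrete, the following is a minimal Python sketch. It is not the SeSaMe PS Function implementation: the function names, the assumption that input sequences are already in reading frame, and the use of plain relative frequency as a stand-in for codon usage bias are all illustrative assumptions.

```python
from collections import Counter


def three_codon_9mers(cds):
    """Yield (start, 9-mer) for each in-frame window of three consecutive codons.

    Assumes `cds` starts at codon position 1 (hypothetical simplification).
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable - 8, 3):
        yield start, cds[start:start + 9]


def ninemer_frequencies(sequences):
    """Relative frequency of each 3-codon 9-mer over a set of coding sequences."""
    counts = Counter()
    for cds in sequences:
        counts.update(kmer for _, kmer in three_codon_9mers(cds))
    total = sum(counts.values())
    return {kmer: n / total for kmer, n in counts.items()} if total else {}


def comparative_table(query, sequences_by_genus):
    """Rows: 9-mers found in the query. Columns: genera (sorted by name).

    Each cell is the relative frequency of that 9-mer in the genus, or 0.0
    if the 9-mer was never observed there.
    """
    genera = sorted(sequences_by_genus)
    freqs = {g: ninemer_frequencies(sequences_by_genus[g]) for g in genera}
    rows = {}
    for start, kmer in three_codon_9mers(query):
        rows[(start, kmer)] = [freqs[g].get(kmer, 0.0) for g in genera]
    return genera, rows


if __name__ == "__main__":
    # Toy data; genus names and sequences are made up for illustration.
    toy = {
        "GenusA": ["ATGGCTGCTAAA", "ATGGCTGCT"],
        "GenusB": ["GCTGCTAAAGGG"],
    }
    genera, rows = comparative_table("ATGGCTGCTAAA", toy)
    print(genera)
    for (start, kmer), values in rows.items():
        print(start, kmer, values)
```

A table of this shape, with one row per 9-mer position and one column per genus, is the kind of input on which a correlation PCA followed by K-means clustering could then be run to look for 9-mers that fall into very small clusters.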