Martínez-Porchas Marcel, Vargas-Albores Francisco
Centro de Investigación en Alimentación y Desarrollo, A. C. Km 0.6 Carretera a La Victoria. Hermosillo, Sonora, México.
Heliyon. 2017 Jul 27;3(7):e00370. doi: 10.1016/j.heliyon.2017.e00370. eCollection 2017 Jul.
The use of k-mers has been a successful strategy for improving metagenomics studies, including taxonomic classification and assembly, and k-mers can also be used to retrieve sequences of interest from the available databases. The aim of this manuscript was to propose a simple but efficient strategy to generate k-mers and to use them to obtain and analyse 16S rRNA sequence fragments. A total of 513,309 bacterial sequences contained in the SILVA database were considered for the study, and homemade PHP scripts were used to search for specific nucleotide chains, recover fragments of bacterial sequences, make calculations and organize information. Consensus sequences matching conserved regions were constructed by aligning most of the primers used in the literature. Nucleotide sequences of 9 to 15 bases (9- to 15-mers) were extracted from the resulting primer contigs. Frequency analysis revealed that k-mer size was inversely related to the occurrence of k-mers in the different conserved regions, suggesting a stringency relationship: high numbers of duplicate reactions were observed with short k-mers, and a lower proportion of sequences was recovered with long ones, with the best results obtained using 12-mers. Using 12-mers with the proposed method to obtain and study sequences was found to be a reliable approach for the analysis of 16S rRNA sequences, and this strategy could probably be extended to other biomarkers. Furthermore, additional applications, such as evaluating the degree of conservation, designing primers and performing other calculations, are proposed as examples.
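The k-mer extraction and search described above can be illustrated with a minimal sketch. It is written in Python rather than the authors' PHP and is not their implementation. The consensus string, the three test sequences and the function names (`extract_kmers`, `find_sites`, `count_matched`) are all hypothetical, and the sketch assumes exact string matching on plain A/C/G/T letters, with no handling of degenerate IUPAC codes.

```python
def extract_kmers(consensus, k):
    """Return the distinct overlapping k-mers of a consensus, in order of first appearance."""
    windows = (consensus[i:i + k] for i in range(len(consensus) - k + 1))
    return list(dict.fromkeys(windows))


def find_sites(sequence, kmers):
    """Return the sorted start positions where any k-mer occurs exactly in the sequence."""
    sites = set()
    for kmer in kmers:
        start = sequence.find(kmer)
        while start != -1:
            sites.add(start)
            start = sequence.find(kmer, start + 1)
    return sorted(sites)


def count_matched(consensus, sequences, k):
    """Count how many sequences contain at least one k-mer of the consensus."""
    kmers = extract_kmers(consensus, k)
    return sum(1 for seq in sequences if find_sites(seq, kmers))


# Toy data (hypothetical, not taken from SILVA).
CONSENSUS = "GTGCCAGCAGCCGCGGTAA"
SEQUENCES = [
    "AAAC" + CONSENSUS + "TTTG",          # carries the full conserved region
    "CCC" + "GTGCCAGCAGCCGAGGTAA" + "GG",  # same region with one substitution
    "ATATATATATATATATATATATAT",            # unrelated sequence
]

if __name__ == "__main__":
    for k in range(9, 16):
        n = count_matched(CONSENSUS, SEQUENCES, k)
        print(f"k={k}: {n} of {len(SEQUENCES)} sequences matched")
```

In this toy case, the sequence with a single substitution is still recovered by the shorter k-mers but is lost once the k-mers become long enough that every window covers the substituted position. That is the stringency effect the abstract describes. The list returned by `find_sites` could also be used to flag sequences with more than one matching site, which is one way to look for the duplicate matches reported with short k-mers.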