Padilla-Del Valle Ricky, Morales-Vale Luis R, Ríos-Velázquez Carlos
Biology Department, University of Puerto Rico at Mayaguez, 108 Street Bo. Miradero Km 1.3, Zoo Entrance, Mayagüez 00680, Puerto Rico.
Genom Data. 2016 Dec 23;11:98-101. doi: 10.1016/j.gdata.2016.12.010. eCollection 2017 Mar.
In Puerto Rico, the microbial diversity of the thermal spring (ThS) in Coamo has never been studied using metagenomics. The focus of our research was to generate a metagenomic library from the ThS of Coamo, Puerto Rico, and to explore its microbial and functional diversity. The metagenomic library from the ThS waters was generated using direct DNA isolation. High molecular weight (40 kbp) DNA was end-repaired, electroeluted, and ligated into a fosmid vector (pCCFOS1), then transduced into EPI300-T1 using T1 bacteriophages. The library consisted of approximately 6000 clones, 90% of which contained metagenomic DNA. Next-generation sequencing technology (Illumina MiSeq) was used to sequence the ThS metagenome. After removal of the cloning vector, 122,026 sequences totaling 33.10 Mbp, with a G + C content of 64%, were annotated and analyzed using the MG-RAST online server. Bacteria were the most abundant domain (95.84%), followed by unidentified sequences (2.28%), viruses (1.67%), eukaryotes (0.15%), and archaea (0.01%). The most abundant phyla were (95.03%), followed by unidentified (2.28%), unclassified from viruses (1.74%), (0.20%) and (0.18%). The most abundant species were , , and sp. Subsystem functional analysis showed that 20% of genes belong to transposable elements, 10% to clustering-based subsystems, and 8% to the production of cofactors. Functional analysis using NOG annotation showed that 82.79% of proteins are poorly characterized, indicating the possibility of novel microbial functions with potential biomedical and biotechnological applications. Metagenomic data were deposited in the NCBI database under the accession number SAMN06131862.
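The figures reported above lend themselves to a few back-of-the-envelope checks. The sketch below is illustrative only and is not part of the authors' analysis. It assumes that 40 kbp can be treated as an average insert size and that the reported 33.10 Mb figure is a total in base pairs. All constant and function names are hypothetical.

```python
# Rough arithmetic on the numbers quoted in the abstract.
# Assumptions (not stated by the authors): 40 kbp is used as an average
# insert size, and 33.10 Mb is read as total base pairs after vector removal.

CLONES = 6000                 # approximate library size
FRACTION_WITH_INSERT = 0.90   # share of clones carrying metagenomic DNA
INSERT_SIZE_BP = 40_000       # assumed average insert size
SEQUENCES = 122_026           # sequences submitted to MG-RAST
TOTAL_BP = 33.10e6            # total sequence length in base pairs

# Domain-level abundances as reported, in percent.
DOMAIN_PERCENT = {
    "Bacteria": 95.84,
    "Unidentified": 2.28,
    "Viruses": 1.67,
    "Eukaryotes": 0.15,
    "Archaea": 0.01,
}


def estimated_insert_bp(clones, fraction_with_insert, insert_size_bp):
    """Upper-bound estimate of cloned metagenomic DNA in the library."""
    return clones * fraction_with_insert * insert_size_bp


def mean_sequence_length(total_bp, n_sequences):
    """Average length of the annotated sequences in base pairs."""
    return total_bp / n_sequences


def percent_total(percentages):
    """Sum of the reported percentages, rounded to two decimals."""
    return round(sum(percentages.values()), 2)


if __name__ == "__main__":
    cloned = estimated_insert_bp(CLONES, FRACTION_WITH_INSERT, INSERT_SIZE_BP)
    print(f"Estimated cloned DNA: {cloned / 1e6:.0f} Mbp")
    print(f"Mean sequence length: {mean_sequence_length(TOTAL_BP, SEQUENCES):.0f} bp")
    print(f"Domain percentages sum to {percent_total(DOMAIN_PERCENT)}%")
```

Under these assumptions the library would hold on the order of 216 Mbp of cloned DNA, the annotated sequences average roughly 271 bp, and the domain-level percentages sum to just under 100%, which is consistent with rounding.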