Genomic and Applied Microbiology and Göttingen Genomics Laboratory, Institute of Microbiology and Genetics, Georg-August-University of Göttingen, 37077 Göttingen, Germany.
Institute of Medical Microbiology, University Medical Center Göttingen, 37075 Göttingen, Germany.
Sci Data. 2017 Oct 17;4:170152. doi: 10.1038/sdata.2017.152.
We present bacterial 16S rRNA gene datasets derived from stool samples of 44 patients with diarrhea indicative of a Clostridioides difficile infection. For 20 of these patients, C. difficile infection was confirmed by clinical evidence. Stool samples were collected from patients in Germany, Ghana, and Indonesia and subjected to DNA isolation. DNA was also isolated from stool samples of 35 asymptomatic control individuals. The bacterial community structure was assessed by 16S rRNA gene analysis (V3-V4 region). Metadata for patients and control individuals include gender, age, country, presence of diarrhea, concomitant diseases, and results of microbiological tests for the presence of C. difficile. We provide an initial data analysis and a dataset overview. Paired-end reads were merged and quality-filtered, primer sequences were removed, and reads were truncated to 400 bp and dereplicated. Singletons were removed, and the remaining sequences were sorted by cluster size and clustered at 97% sequence similarity; chimeric sequences were discarded. Taxonomy was assigned to each operational taxonomic unit (OTU) by BLASTn searches against SILVA database 123.1, and a table was constructed.
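The truncation, dereplication, singleton-removal, and size-sorting steps described above can be illustrated with a minimal Python sketch. This is not the authors' pipeline, and the abstract does not name the software used. The function name `dereplicate`, its parameters, and the choice to discard reads shorter than the truncation length are assumptions made for illustration. Merging, quality filtering, primer removal, 97% clustering, chimera removal, and taxonomy assignment are not covered.

```python
from collections import Counter


def dereplicate(reads, trunc_len=400, min_size=2):
    """Truncate reads, collapse identical sequences, drop singletons, sort by size.

    Hypothetical helper for illustration only; not taken from the source.
    """
    # Assumption: reads shorter than trunc_len are discarded rather than kept.
    truncated = [read[:trunc_len] for read in reads if len(read) >= trunc_len]

    # Dereplication: count how often each identical sequence occurs.
    counts = Counter(truncated)

    # Singleton removal: keep only sequences seen at least min_size times.
    kept = [(seq, n) for seq, n in counts.items() if n >= min_size]

    # Sort by decreasing abundance; ties are broken alphabetically so the
    # order is deterministic.
    kept.sort(key=lambda item: (-item[1], item[0]))
    return kept


if __name__ == "__main__":
    # Toy reads with a truncation length of 4 instead of 400 bp.
    toy_reads = ["ACGTAA", "ACGTCC", "ACGTGG", "TTTTAA", "TTTTCC", "GGGGAA", "ACG"]
    for seq, size in dereplicate(toy_reads, trunc_len=4):
        print(seq, size)
```

In the toy input, three reads collapse to `ACGT` and two to `TTTT`. `GGGG` occurs once and is removed as a singleton, and `ACG` is too short to truncate. Sorting by abundance before clustering means the most frequent sequences are considered first as cluster seeds in a greedy clustering scheme.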