Department of Microbiology, Harvard Medical School, Boston, Massachusetts, USA.
Division of Infectious Diseases, Brigham & Women's Hospital, Boston, Massachusetts, USA.
mSphere. 2019 Feb 20;4(1):e00031-19. doi: 10.1128/mSphere.00031-19.
Transposon insertion sequencing (TIS) is a widely used technique for conducting genome-scale forward genetic screens in bacteria. However, few methods enable comparison of TIS data across multiple replicates of a screen or across independent screens, including screens performed in different organisms. Here, we introduce an analytic framework, comparative TIS (CompTIS), which uses unsupervised learning to enable meta-analysis of multiple TIS data sets. CompTIS first applies screen-level principal-component analysis (PCA) and clustering to identify variation between the TIS screens. This initial screen-level analysis facilitates the selection of related screens for additional analyses, reveals the relatedness of complex environments based on growth phenotypes measured by TIS, and provides a useful quality control step. Subsequently, PCA is performed on genes to identify loci whose corresponding mutants show concordant or discordant phenotypes across all screens or in a subset of them. We used CompTIS to analyze published intestinal colonization TIS data sets from two vibrio species. Gene-level analyses identified both pan-vibrio genes required for intestinal colonization and conserved genes that displayed species-specific requirements. CompTIS is applicable to virtually any combination of TIS screens and can be implemented without regard to either the number of screens or the methods used for upstream data analysis.
Forward genetic screens are powerful tools for functional genomics. The comparison of similar forward genetic screens performed in different organisms enables the identification of genes with similar or different phenotypes across organisms. Transposon insertion sequencing is a widely used method for conducting genome-scale forward genetic screens in bacteria, yet few bioinformatic approaches have been developed to compare the results of screen replicates and of different screens conducted across species or strains. Here, we used principal-component analysis (PCA) and hierarchical clustering, two unsupervised learning approaches, to analyze the relatedness of multiple screens of pathogenic vibrios. This analytic framework reveals both shared pan-vibrio requirements for intestinal colonization and strain-specific dependencies. Our findings suggest that PCA-based analytics will be a straightforward, widely applicable approach for comparing diverse transposon insertion sequencing screens.
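The two-stage analysis described above (PCA and clustering on screens, then PCA on genes) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the input is a matrix of per-gene log2 fold changes for conserved genes (rows) across screens (columns); the simulated data, the screen names, the helper `pca`, and the choice of average linkage with correlation distance are all hypothetical and serve only as illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage


def pca(X, n_components=2):
    """PCA via SVD. Rows of X are observations, columns are variables."""
    Xc = X - X.mean(axis=0)  # center each variable
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / np.sum(S ** 2)
    return scores, Vt[:n_components], explained[:n_components]


# Hypothetical input: log2 fold changes for 200 conserved genes (rows)
# in four screens (columns). The values are simulated; real values would
# come from whatever upstream TIS pipeline was used.
rng = np.random.default_rng(0)
n_genes = 200
shared = rng.normal(0.0, 1.0, n_genes)    # effect common to both species
specific = rng.normal(0.0, 0.5, n_genes)  # effect that differs by species
screens = ["speciesA_rep1", "speciesA_rep2", "speciesB_rep1", "speciesB_rep2"]
signs = [1, 1, -1, -1]
lfc = np.column_stack(
    [shared + s * specific + rng.normal(0.0, 0.1, n_genes) for s in signs]
)

# Screen-level analysis: each screen is one observation.
screen_scores, _, screen_var = pca(lfc.T)
tree = linkage(lfc.T, method="average", metric="correlation")
labels = fcluster(tree, t=2, criterion="maxclust")

# Gene-level analysis: each gene is one observation, screens are variables.
gene_scores, gene_loadings, gene_var = pca(lfc)

for name, label, xy in zip(screens, labels, screen_scores):
    print(name, "cluster", label, "PC scores", np.round(xy, 2))
print("gene-level PC1 loadings:", np.round(gene_loadings[0], 2))
print("gene-level PC2 loadings:", np.round(gene_loadings[1], 2))
```

In this simulated setup, a gene-level component whose loadings share one sign across all screens corresponds to concordant phenotypes, while a component whose loadings differ in sign between the two species corresponds to discordant, species-specific phenotypes. Genes with extreme scores on each component would be the candidates to inspect.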