A zero-inflated Poisson model for insertion tolerance analysis of genes based on Tn-seq data.

Affiliations

Department of Statistics, Iowa State University.

Department of Statistics, Iowa State University, Department of Veterinary Diagnostic and Production Animal Medicine and.

Publication information

Bioinformatics. 2016 Jun 1;32(11):1701-8. doi: 10.1093/bioinformatics/btw061. Epub 2016 Feb 1.

DOI:10.1093/bioinformatics/btw061

PMID:26833344

Abstract

MOTIVATION

Transposon insertion sequencing (Tn-seq) is an emerging technology that combines transposon mutagenesis with next-generation sequencing to identify genes related to bacterial survival. The data from Tn-seq experiments consist of sequence reads mapped to millions of potential transposon insertion sites, and a large portion of these sites have zero mapped reads. Novel statistical methods for Tn-seq data analysis are needed to infer the roles of genes in bacterial growth.

RESULTS

In this article, we propose a zero-inflated Poisson model for analyzing Tn-seq data, which are high-dimensional and contain an excess of zeros. Maximum likelihood estimates of the model parameters are obtained using an expectation-maximization (EM) algorithm, and pseudogenes are used to construct appropriate statistical tests for the transposon insertion tolerance of the normal genes of interest. We propose a multiple testing procedure that assigns each gene to one of three states (hypo-tolerant, tolerant or hyper-tolerant) while controlling the false discovery rate. We evaluate the proposed method with simulation studies and apply it to a real Tn-seq dataset from an experiment that studied the bacterial pathogen Campylobacter jejuni.

AVAILABILITY AND IMPLEMENTATION

We provide R code for implementing our proposed method at http://github.com/ffliu/TnSeq. A user's guide with example data analysis is also available there.
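To illustrate how an EM algorithm can fit a zero-inflated Poisson (ZIP) distribution, here is a minimal Python sketch. It is not the authors' method or code: their R implementation lives in the repository cited in the abstract, and their model also involves pseudogene-based tests and a multiple testing procedure that are not reproduced here. The sketch assumes a single group of insertion-site read counts with one zero-inflation probability `pi` and one Poisson mean `lam`. The function names `fit_zip_em` and `sample_zip` are hypothetical.

```python
import math
import random


def fit_zip_em(counts, max_iter=500, tol=1e-10):
    """Fit a zero-inflated Poisson model by EM (illustrative sketch).

    Assumed model for a single group of counts:
        P(Y = 0) = pi + (1 - pi) * exp(-lam)
        P(Y = k) = (1 - pi) * exp(-lam) * lam**k / k!   for k >= 1
    Returns the estimates (pi, lam).
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    if total == 0:
        raise ValueError("all counts are zero; lam is not identifiable")

    # Crude starting values: half of the zeros are treated as structural.
    pi = 0.5 * n_zero / n
    lam = total / (n - n_zero)

    for _ in range(max_iter):
        # E-step: posterior probability that an observed zero is structural.
        # Nonzero counts cannot be structural zeros, so they contribute 0.
        z0 = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_structural = n_zero * z0

        # M-step: closed-form updates given the expected memberships.
        new_pi = expected_structural / n
        new_lam = total / (n - expected_structural)

        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


def sample_zip(n, pi, lam, rng):
    """Draw n ZIP counts; uses Knuth's Poisson sampler (fine for small lam)."""
    threshold = math.exp(-lam)
    out = []
    for _ in range(n):
        if rng.random() < pi:
            out.append(0)  # structural zero
            continue
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        out.append(k)
    return out


# Simulated data with made-up parameters, purely for demonstration.
rng = random.Random(0)
simulated = sample_zip(20000, pi=0.6, lam=3.0, rng=rng)
pi_hat, lam_hat = fit_zip_em(simulated)
print(f"pi_hat={pi_hat:.3f}, lam_hat={lam_hat:.3f}")
```

Because both M-step updates are computed from the same expected number of structural zeros, the fitted mean `(1 - pi_hat) * lam_hat` matches the sample mean after every iteration, which is a convenient sanity check on the fit.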

CONTACT

pliu@iastate.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
