Department of Hematology, Erasmus University Medical Center, Rotterdam, The Netherlands.
Hum Gene Ther. 2012 Nov;23(11):1209-19. doi: 10.1089/hum.2011.037. Epub 2012 Sep 27.
Introducing therapeutic genes into hematopoietic stem cells using retroviral vector-mediated gene transfer is an effective treatment for monogenic diseases. The risks of therapeutic gene integration include aberrant expression of a neighboring gene, resulting in oncogenesis at low frequencies (10^-7 to 10^-6 per transduced cell). Mechanisms governing insertional mutagenesis are the subject of intensive ongoing studies that produce large amounts of sequencing data representing genomic regions flanking viral integration sites (IS). Validating and analyzing these data require automated bioinformatics applications. The exact methods used vary between applications, based on the requirements and preferences of the designer. The parameters used to analyze sequence data can shape the resulting integration site annotations, but a comprehensive examination of these effects is lacking. Here we present a web-based tool for integration site analysis, called Methods for Analyzing ViRal Integration Collections (MAVRIC), and use its highly customizable interface to examine how IS annotations vary with the analysis parameters. We used the integration data of the previously published adenosine deaminase severe combined immunodeficiency (ADA-SCID) gene therapy trials to evaluate MAVRIC. The output illustrates how MAVRIC allows for direct multiparameter comparison of integration patterns. Careful analysis of the SCID data and reanalyses using different parameters for trimming, alignment, and repeat masking revealed the degree of variation that can be expected to arise from changes in these parameters. We observed mainly small differences in annotation, with the largest effects caused by masking repeat sequences and by changing the size of the window around the IS.
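The effect of the window size around an IS can be illustrated with a minimal sketch. This is not MAVRIC's implementation: the function `genes_near_site`, the `Gene` record, and the gene names and coordinates are all hypothetical, chosen only to show how the same integration site can receive different annotations as the window grows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    name: str
    chrom: str
    start: int  # inclusive
    end: int    # inclusive


def genes_near_site(chrom, position, genes, window):
    """Return sorted names of genes overlapping [position - window, position + window]."""
    lo, hi = position - window, position + window
    return sorted(
        g.name for g in genes
        if g.chrom == chrom and g.start <= hi and g.end >= lo
    )


# Made-up gene coordinates, for illustration only (not real annotations).
TOY_GENES = [
    Gene("GENE_A", "chr1", 10_000, 20_000),
    Gene("GENE_B", "chr1", 60_000, 80_000),
    Gene("GENE_C", "chr2", 5_000, 9_000),
]

if __name__ == "__main__":
    site = ("chr1", 30_000)  # hypothetical integration site
    for window in (5_000, 10_000, 50_000):
        print(window, genes_near_site(*site, TOY_GENES, window))
```

With these toy values, the site has no annotated gene at a 5 kb window, one at 10 kb, and two at 50 kb, which is the kind of parameter-dependent variation in annotation that the abstract describes.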