Barrows Nicholas J, Le Sommer Caroline, Garcia-Blanco Mariano A, Pearson James L
Department of Molecular Genetics and Microbiology, Duke-NUS Graduate Medical School, Durham, NC, USA.
J Biomol Screen. 2010 Aug;15(7):735-47. doi: 10.1177/1087057110374994. Epub 2010 Jul 12.
RNA interference-based screening is a powerful new genomic technology that addresses gene function en masse. To evaluate factors influencing hit list composition and reproducibility, the authors performed 2 identically designed small interfering RNA (siRNA)-based, whole-genome screens for host factors supporting yellow fever virus infection. The 2 screens were separate experiments completed 5 months apart, which allows direct assessment of the reproducibility of a given siRNA technology when performed in the same environment. Candidate hit lists generated by sum rank, median absolute deviation, z-score, and strictly standardized mean difference were compared within and between whole-genome screens. Applying these analysis methodologies to a single screening data set, using a fixed threshold equivalent to a p-value ≤ 0.001, produced hit lists ranging from 82 to 1140 members and highlighted the tremendous impact that analysis methodology has on hit list composition. Intra- and interscreen reproducibility was significantly influenced by the analysis methodology and ranged from 32% to 99%. This study also highlighted the power of testing at least 2 independent siRNAs for each gene product in primary screens. To facilitate validation, the authors conclude by suggesting methods to reduce false discovery at the primary screening stage. In this study, they present the first comprehensive comparison of multiple analysis strategies and demonstrate the impact of the analysis methodology on the composition of the "hit list." Therefore, they propose that the entire data set derived from functional genome-scale screens, especially if publicly funded, should be made available, as is done with data derived from gene expression and genome-wide association studies.
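As a rough illustration of why the choice of analysis method can change a hit list so much, the following minimal sketch compares two of the scoring approaches named above, the classical z-score and a median absolute deviation (MAD)-based robust z-score, on one small set of made-up well readouts. Everything here is an assumption for illustration only: the function names, the toy values, the one-sided cutoff derived from p = 0.001, and the convention that a lower readout means reduced infection. The abstract does not specify the authors' normalization, scoring direction, or exact thresholds, and sum rank and strictly standardized mean difference are not covered.

```python
from statistics import NormalDist, mean, median, stdev


def z_scores(values):
    """Classical z-score: centre on the mean, scale by the sample SD."""
    mu = mean(values)
    sd = stdev(values)
    return [(v - mu) / sd for v in values]


def robust_z_scores(values):
    """MAD-based robust z-score: centre on the median, scale by 1.4826 * MAD.

    The factor 1.4826 makes the MAD comparable to the SD for normal data.
    """
    med = median(values)
    mad = median([abs(v - med) for v in values])
    scale = 1.4826 * mad
    return [(v - med) / scale for v in values]


def call_hits(scores, p=0.001):
    """Return indices whose score falls at or below the one-sided normal
    quantile for p (about -3.09 for p = 0.001). One-sided scoring is an
    assumption of this sketch, not a detail taken from the study."""
    cutoff = NormalDist().inv_cdf(p)
    return [i for i, s in enumerate(scores) if s <= cutoff]


# Hypothetical readouts for 11 wells; the last well is a strong outlier.
values = [100, 102, 98, 101, 99, 103, 97, 100, 101, 99, 20]

z_hits = call_hits(z_scores(values))
mad_hits = call_hits(robust_z_scores(values))

print(z_hits)    # [] : the outlier inflates the SD and masks itself
print(mad_hits)  # [10] : the median and MAD are barely affected by it
```

In this toy case the same data and the same nominal threshold yield an empty hit list under the z-score and one hit under the MAD-based score, because the outlier inflates the standard deviation used to score it. That is the kind of method dependence the abstract describes, although the magnitudes in a real genome-wide screen would depend on details not given here.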