Zhang Xiaohua Douglas, Wang Dandan, Sun Shixue, Zhang Heping
CRDA, Faculty of Health Sciences, University of Macau, Taipa, Macau 999078, China.
Department of Biostatistics, Yale University, New Haven, CT 06511, USA.
Bioinformatics. 2021 Apr 1;36(22-23):5299-5303. doi: 10.1093/bioinformatics/btaa1049.
High-throughput screening (HTS) is a vital automation technology in biomedical research in both industry and academia. The well-known Z-factor has been widely used as a gatekeeper to assure assay quality in HTS studies. However, many researchers and users may not realize that the Z-factor has major issues.
In this article, we explore and demonstrate four major issues so that researchers can use the Z-factor appropriately. First, the Z-factor violates the Pythagorean theorem of statistics. Second, the Z-factor is applied to quality control (QC) in HTS studies without any adjustment for sampling error. Third, the expectation of the sample-based Z-factor does not exist. Fourth, the thresholds in the Z-factor-based criterion lack a theoretical basis. We propose an approach that avoids these issues and construct new QC criteria under homoscedasticity, so that researchers can choose a statistically grounded criterion for QC in HTS studies. We implemented this approach in an R package and demonstrated its utility in multiple CRISPR/Cas9 and siRNA HTS studies.
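To make the first issue concrete, the following minimal Python sketch computes the sample Z-factor alongside a plug-in estimate of the strictly standardized mean difference (SSMD) under an equal-variance assumption. It is an illustration only and not the qcSSMDhomo API: the function names and the toy control readings are hypothetical, and the pooled-variance plug-in estimator is an assumption that may differ from the estimators used in the article. The Z-factor adds the two standard deviations, whereas the variance of a difference of independent variables is the sum of the variances, which is the quantity SSMD uses.

```python
import math
import statistics


def z_factor(pos, neg):
    """Sample Z-factor: 1 - 3 * (s_pos + s_neg) / |mean_pos - mean_neg|."""
    mean_diff = statistics.mean(pos) - statistics.mean(neg)
    sd_sum = statistics.stdev(pos) + statistics.stdev(neg)
    return 1.0 - 3.0 * sd_sum / abs(mean_diff)


def ssmd_equal_variance(pos, neg):
    """Plug-in SSMD estimate assuming both controls share one variance.

    SSMD = (mu_pos - mu_neg) / sqrt(var_pos + var_neg); with a common
    variance the denominator becomes sqrt(2 * pooled_variance).
    """
    n_pos, n_neg = len(pos), len(neg)
    pooled_var = (
        (n_pos - 1) * statistics.variance(pos)
        + (n_neg - 1) * statistics.variance(neg)
    ) / (n_pos + n_neg - 2)
    mean_diff = statistics.mean(pos) - statistics.mean(neg)
    return mean_diff / math.sqrt(2.0 * pooled_var)


def z_from_ssmd_equal_variance(ssmd):
    """Population-level identity when both SDs are equal: Z = 1 - 3*sqrt(2)/|SSMD|."""
    return 1.0 - 3.0 * math.sqrt(2.0) / abs(ssmd)


# Hypothetical control-well readings, for illustration only.
positive_controls = [10, 12, 11, 13, 9]
negative_controls = [1, 2, 3, 2, 2]

z_value = z_factor(positive_controls, negative_controls)
ssmd_value = ssmd_equal_variance(positive_controls, negative_controls)
print(f"Z-factor = {z_value:.3f}, SSMD = {ssmd_value:.3f}")
```

Under equal population standard deviations, the identity in `z_from_ssmd_equal_variance` maps a Z-factor cutoff to an SSMD cutoff. For example, a Z-factor of 0.5 corresponds to |SSMD| = 6√2, roughly 8.49. The sketch uses plain sample estimates and does not address the sampling-error adjustment that the article identifies as missing.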
The R package qcSSMDhomo is freely available from GitHub: https://github.com/Karena6688/qcSSMDhomo. The file qcSSMDhomo_1.0.0.tar.gz (for Windows) containing qcSSMDhomo is also available at Bioinformatics online. qcSSMDhomo is distributed under the GNU General Public License.
Supplementary data are available at Bioinformatics online.