Department of Oncology, University of Cambridge, Cambridge, Cambridgeshire, United Kingdom.
PLoS One. 2012;7(8):e41815. doi: 10.1371/journal.pone.0041815. Epub 2012 Aug 9.
Sample tracking errors have always been, and always will be, a part of the practical implementation of large experiments. It has recently been proposed that expression quantitative trait loci (eQTLs) and their associated effects could be used to identify sample mix-ups, and this approach has been applied to a number of large population genomics studies to illustrate the prevalence of the problem. We adopted a similar approach, termed 'BADGER', in the METABRIC project. METABRIC is a large breast cancer study that may have been the first in which eQTL-based detection of mismatches was used during the study, rather than after the event, to aid quality assurance. We report here on the particular issues associated with large cancer studies performed using historical samples, which complicate the interpretation of such approaches. In particular, we identify the complications of using tumour samples, of considering cellularity and RNA quality, of distinct subgroups existing in the study population (including family structures), and of choosing which eQTLs to use. We also present some results regarding the design of experiments in light of these matters. The eQTL-based approach to identifying sample tracking errors is seen to be of value to these studies, but it requires care in its implementation.
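The general idea of eQTL-based mix-up detection can be illustrated with a minimal sketch. This is not the BADGER implementation, which the abstract does not describe; the function names, the 0/1/2 genotype coding, and the use of a per-genotype median as the expected expression level are all assumptions made for illustration. For each eQTL, the sketch estimates the typical expression in each genotype class from the nominal pairing, then checks whether each expression profile is closest to the genotype profile it is supposed to belong to.

```python
from statistics import median


def class_centres(genotypes, expression):
    """Median expression per genotype class (0, 1, 2) for one eQTL.

    The median is an illustrative choice: it tolerates a small number of
    mislabelled samples better than the mean would.
    """
    groups = {}
    for g, e in zip(genotypes, expression):
        groups.setdefault(g, []).append(e)
    return {g: median(values) for g, values in groups.items()}


def mismatch_scores(geno, expr):
    """Score every (expression sample, genotype sample) pairing.

    geno[k][i] is the genotype of DNA sample i at eQTL k.
    expr[k][j] is the expression of RNA sample j for the gene of eQTL k.
    Samples i and j are nominally the same individual when i == j.

    Returns scores[j][i]: the mean squared distance between RNA sample j's
    expression and the expression expected from DNA sample i's genotypes.
    """
    n_eqtl = len(geno)
    n_samples = len(geno[0])
    scores = [[0.0] * n_samples for _ in range(n_samples)]
    for k in range(n_eqtl):
        centres = class_centres(geno[k], expr[k])
        for j in range(n_samples):
            for i in range(n_samples):
                diff = expr[k][j] - centres[geno[k][i]]
                scores[j][i] += diff * diff / n_eqtl
    return scores


def flag_mixups(geno, expr):
    """Return {rna_index: best_dna_index} for samples whose best-matching
    genotype profile is not the nominally paired one."""
    flagged = {}
    for j, row in enumerate(mismatch_scores(geno, expr)):
        best = min(range(len(row)), key=row.__getitem__)
        if best != j:
            flagged[j] = best
    return flagged
```

A sketch like this ignores the complications the abstract highlights. Tumour samples, variable cellularity and RNA quality, population subgroups, and related individuals can all weaken or distort the eQTL signal, so a flagged pairing is a prompt for investigation and not proof of a tracking error.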