Impact of mouse contamination in genomic profiling of patient-derived models and best practice for robust analysis.

Affiliation

Department of Biomedical Systems Informatics and Brain Korea 21 PLUS Project for Medical Science, Yonsei University College of Medicine, Seoul, 03722, South Korea.

Publication information

Genome Biol. 2019 Nov 11;20(1):231. doi: 10.1186/s13059-019-1849-2.

Abstract

BACKGROUND

Patient-derived xenograft and cell line models are popular models for clinical cancer research. However, the inevitable inclusion of a mouse genome in a patient-derived model is a remaining concern in the analysis. Although multiple tools and filtering strategies have been developed to account for this, research has yet to demonstrate the exact impact of the mouse genome and the optimal use of these tools and filtering strategies in an analysis pipeline.

RESULTS

We construct a benchmark dataset of 5 liver tissues from 3 mouse strains using a human whole-exome sequencing kit. Next-generation sequencing reads from mouse tissues are mappable to 49% of the human genome and 409 cancer genes. In total, 1,207,556 mouse-specific alleles are aligned to the human genome reference, including 467,232 (38.7%) alleles with high sensitivity to contamination, which are pervasive causes of false cancer mutations in public databases and are signatures for predicting global contamination. Next, we assess the performance of 8 filtering methods in terms of mouse read filtration and reduction of mouse-specific alleles. All filtering tools generally perform well, although differences in algorithm strictness and efficiency of mouse allele removal are observed. Therefore, we develop a best practice pipeline that contains the estimation of contamination level, mouse read filtration, and variant filtration.
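The first and last pipeline stages named above (contamination estimation and variant filtration) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the input shapes, and the use of the median variant allele frequency at marker sites as a crude contamination proxy are all assumptions made for illustration. Mouse read filtration is left out because it depends on whichever of the assessed tools is chosen. The sketch also re-derives the 38.7% figure from the two allele counts quoted in the abstract.

```python
from statistics import median

# Counts quoted in the abstract.
TOTAL_MOUSE_ALLELES = 1_207_556
HIGH_SENSITIVITY_ALLELES = 467_232


def high_sensitivity_percent():
    """Share of mouse-specific alleles that are highly sensitive to contamination."""
    return round(100 * HIGH_SENSITIVITY_ALLELES / TOTAL_MOUSE_ALLELES, 1)


def estimate_contamination(marker_counts):
    """Crude contamination proxy (an assumption, not the paper's estimator).

    marker_counts: iterable of (alt_reads, total_reads) pairs observed at
    hypothetical mouse-specific marker sites. Returns the median VAF.
    """
    vafs = [alt / total for alt, total in marker_counts if total > 0]
    if not vafs:
        raise ValueError("no covered marker sites")
    return median(vafs)


def filter_variants(variants, mouse_alleles):
    """Drop calls whose (chrom, pos, alt) matches a known mouse-specific allele.

    variants: list of dicts with 'chrom', 'pos', 'alt' keys (hypothetical schema).
    mouse_alleles: set of (chrom, pos, alt) tuples (hypothetical blacklist).
    """
    return [
        v for v in variants
        if (v["chrom"], v["pos"], v["alt"]) not in mouse_alleles
    ]


if __name__ == "__main__":
    print(high_sensitivity_percent())
    markers = [(10, 100), (20, 100), (30, 100)]  # made-up read counts
    print(estimate_contamination(markers))
    calls = [
        {"chrom": "chr1", "pos": 100, "alt": "T"},
        {"chrom": "chr2", "pos": 200, "alt": "G"},
    ]
    blacklist = {("chr1", 100, "T")}  # made-up mouse-specific allele
    print(filter_variants(calls, blacklist))
```

In practice the marker sites and the allele blacklist would come from a curated resource such as the mouse-specific alleles the authors describe, and the read counts from a pileup of the aligned sample. The values used here are placeholders.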

CONCLUSIONS

The inclusion of mouse cells in patient-derived models hinders genomic analysis and should be addressed carefully. Our suggested guidelines improve the robustness and maximize the utility of genomic analysis of these models.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb62/6844030/adfdb0d4d4d6/13059_2019_1849_Fig1_HTML.jpg
