SeeCiTe: a method to assess CNV calls from SNP arrays using trio data.

Authors

Lavrichenko Ksenia, Helgeland Øyvind, Njølstad Pål R, Jonassen Inge, Johansson Stefan

Affiliations

Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway.

Department of Clinical Science, University of Bergen, Bergen, Norway.

Publication information

Bioinformatics. 2021 Jul 27;37(13):1876-1883. doi: 10.1093/bioinformatics/btab028.

Abstract

MOTIVATION

Single nucleotide polymorphism (SNP) genotyping arrays remain an attractive platform for assaying copy number variants (CNVs) in large population-wide cohorts. However, current tools for calling CNVs are still prone to extensive false positive calls when applied to biobank-scale arrays. Moreover, there is a lack of methods that exploit cohorts with trios available (e.g. nuclear families) to assist in quality control and downstream analyses following the calling.

RESULTS

We developed SeeCiTe (Seeing CNVs in Trios), a novel CNV quality-control tool that postprocesses the output of current CNV-calling tools. It exploits child-parent trio data to classify calls into quality categories and provides a set of visualizations for each putative CNV call in the offspring. We apply it to the Norwegian Mother, Father and Child Cohort Study (MoBa) and show that SeeCiTe improves specificity and sensitivity compared to common empirical filtering strategies. To our knowledge, it is the first tool that utilizes probe-level CNV data in trios (and singletons) to systematically highlight potential artifacts and visualize signal intensities in a streamlined fashion suitable for biobank-scale studies.
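The abstract does not describe how SeeCiTe classifies calls, and the tool itself works on probe-level intensity data. As a much simpler illustration of the general idea of using trio data to triage an offspring call, the Python sketch below labels a child's CNV call as inherited when a parent carries a same-type call with sufficient reciprocal overlap. This is not SeeCiTe's algorithm: the `CnvCall` record, the function names, the category labels and the 0.5 overlap threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class CnvCall:
    """Hypothetical record for one CNV call (not a SeeCiTe data structure)."""
    sample: str
    chrom: str
    start: int        # first base of the call, inclusive
    end: int          # last base of the call, inclusive
    copy_number: int  # e.g. 1 for a deletion, 3 for a duplication


def reciprocal_overlap(a: CnvCall, b: CnvCall) -> float:
    """Return the smaller of the two fractions of each call covered by the other."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    len_a = a.end - a.start + 1
    len_b = b.end - b.start + 1
    return min(shared / len_a, shared / len_b)


def classify_offspring_call(
    child: CnvCall,
    parent_calls: Iterable[CnvCall],
    min_overlap: float = 0.5,  # assumed threshold, chosen only for illustration
) -> str:
    """Label a child's call by whether a parental call of the same type supports it."""
    child_is_deletion = child.copy_number < 2
    for parent in parent_calls:
        parent_is_deletion = parent.copy_number < 2
        if (
            parent_is_deletion == child_is_deletion
            and reciprocal_overlap(child, parent) >= min_overlap
        ):
            return "inherited"
    return "putative_de_novo"


if __name__ == "__main__":
    child = CnvCall("child", "chr1", 100, 199, 1)
    father = CnvCall("father", "chr1", 120, 219, 1)
    mother = CnvCall("mother", "chr1", 1000, 1100, 3)
    print(classify_offspring_call(child, [father, mother]))
```

A call-overlap check like this depends entirely on the parental calls being correct, which is one reason a tool might look at the underlying probe-level signal in all three family members instead.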

AVAILABILITY AND IMPLEMENTATION

The software is implemented in R with the source code freely available at https://github.com/aksenia/SeeCiTe.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
