A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data.

Author information

Interdisciplinary Program in Statistics and Data Science, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA.

Department of Mathematics, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA.

Publication information

BMC Bioinformatics. 2021 Jun 26;22(1):348. doi: 10.1186/s12859-021-04265-7.

Abstract

BACKGROUND

Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data.

RESULTS

The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common.
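To make the objective concrete: the Ky Fan k-norm of a matrix is the sum of its k largest singular values, which for a covariance matrix equals the sum of its k largest eigenvalues. The sketch below only evaluates that quantity on a toy dosage matrix; it is not the MCPCA_PopGen implementation and does not perform the optimization over transformations. The function names, the toy data, and the choice to treat loci as columns are assumptions made for illustration.

```python
import numpy as np


def ky_fan_norm(matrix, k):
    """Return the Ky Fan k-norm: the sum of the k largest singular values."""
    # np.linalg.svd returns singular values sorted in descending order.
    singular_values = np.linalg.svd(matrix, compute_uv=False)
    return float(singular_values[:k].sum())


def covariance_ky_fan(dosages, k):
    """Ky Fan k-norm of the covariance matrix of the columns of `dosages`.

    Assumption: rows are samples and columns are loci.
    """
    cov = np.cov(dosages, rowvar=False)
    return ky_fan_norm(cov, k)


# Hypothetical toy data: 50 samples x 20 loci, dosages in [0, 2].
rng = np.random.default_rng(0)
dosages = rng.uniform(0.0, 2.0, size=(50, 20))

# Variance captured by the top two principal directions of the raw dosages.
score = covariance_ky_fan(dosages, k=2)
print(f"Ky Fan 2-norm of the covariance matrix: {score:.4f}")
```

In a method of the kind described, a score like this would be recomputed after applying candidate transformations to each locus's dosages, and the transformations yielding the largest value would be preferred.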

CONCLUSIONS

We apply MCPCA_PopGen to samples from two indigenous Siberian populations and reveal hidden population structure accurately using only a single chromosome. The MCPCA_PopGen package is available at https://github.com/yiwenstat/MCPCA_PopGen


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/ad47a766a33e/12859_2021_4265_Fig1_HTML.jpg
