A novel nonlinear dimension reduction approach to infer population structure for low-coverage sequencing data.

Author affiliations

Interdisciplinary Program in Statistics and Data Science, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA.

Department of Mathematics, University of Arizona, 617 N. Santa Rita Ave., 85721, Tucson, USA.

Publication information

BMC Bioinformatics. 2021 Jun 26;22(1):348. doi: 10.1186/s12859-021-04265-7.

DOI:10.1186/s12859-021-04265-7
PMID:34174829
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8236193/

Abstract

BACKGROUND

Low-depth sequencing allows researchers to increase sample size at the expense of lower accuracy. To incorporate uncertainties while maintaining statistical power, we introduce MCPCA_PopGen to analyze population structure of low-depth sequencing data.

RESULTS

The method optimizes the choice of nonlinear transformations of dosages to maximize the Ky Fan norm of the covariance matrix. The transformation incorporates the uncertainty in calling between heterozygotes and the common homozygotes for loci having a rare allele and is more linear when both variants are common.
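
For readers unfamiliar with the objective named above: the Ky Fan k-norm of a symmetric positive semidefinite matrix is the sum of its k largest eigenvalues, so maximizing it pushes as much variance as possible into the top k principal components. The Python sketch below only evaluates that objective for a fixed, hand-picked transformation of a simulated dosage matrix. It is not the MCPCA_PopGen implementation and does not perform the optimization over transformations. The helper names, the simulated data, the column standardization step, and the square-root transform are all illustrative assumptions.

```python
import numpy as np


def ky_fan_norm(sym_matrix, k):
    """Ky Fan k-norm of a symmetric PSD matrix: sum of its k largest eigenvalues."""
    eigvals = np.linalg.eigvalsh(sym_matrix)  # eigenvalues in ascending order
    return float(np.sum(eigvals[-k:]))


def transformed_covariance(dosages, transform=lambda d: d):
    """Covariance across loci after an elementwise transform of the dosages.

    Each column (locus) is standardized to zero mean and unit variance;
    this normalization is an assumption made for the illustration.
    """
    x = transform(np.asarray(dosages, dtype=float))
    x = x - x.mean(axis=0)
    x = x / x.std(axis=0, ddof=1)
    return np.cov(x, rowvar=False)


# Simulated data (hypothetical): 100 samples from two groups, 20 loci,
# with noisy dosages in [0, 2] standing in for low-depth genotype estimates.
rng = np.random.default_rng(0)
group = np.repeat([0, 1], 50)
allele_freq = np.where(group[:, None] == 0, 0.1, 0.4) * np.ones((100, 20))
dosages = rng.binomial(2, allele_freq) + rng.normal(0.0, 0.2, size=(100, 20))
dosages = np.clip(dosages, 0.0, 2.0)

# Compare the objective under the identity and one candidate nonlinear transform.
linear_score = ky_fan_norm(transformed_covariance(dosages), k=2)
sqrt_score = ky_fan_norm(transformed_covariance(dosages, np.sqrt), k=2)
print(f"identity: {linear_score:.3f}  sqrt: {sqrt_score:.3f}")
```

Because every column is standardized, the covariance matrix has trace equal to the number of loci, so the score can be read as the amount of total variance captured by the top k components under each transformation.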

CONCLUSIONS

We apply MCPCA_PopGen to samples from two indigenous Siberian populations and reveal hidden population structure accurately using only a single chromosome. The MCPCA_PopGen package is available on https://github.com/yiwenstat/MCPCA_PopGen .


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/ad47a766a33e/12859_2021_4265_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/29da8e1683bb/12859_2021_4265_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/6cb6fff81b26/12859_2021_4265_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/628bcf1dd703/12859_2021_4265_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/42d2cf22ade8/12859_2021_4265_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6846/8236193/dbc35242fd13/12859_2021_4265_Fig6_HTML.jpg
