SHEsisPCA: a GPU-based software to correct for population stratification that efficiently accelerates the process for handling genome-wide datasets.

Author affiliations

Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Jiao Tong University, Shanghai 200230, China; Institute of Social Cognitive and Behavioral Sciences, Shanghai Jiao Tong University, Shanghai 200240, China; School of Bio-medical Engineering, Shanghai Jiao Tong University, Shanghai 200230, China.

Bio-X Institutes, Key Laboratory for the Genetics of Developmental and Neuropsychiatric Disorders (Ministry of Education), Shanghai Jiao Tong University, Shanghai 200230, China; Institute of Social Cognitive and Behavioral Sciences, Shanghai Jiao Tong University, Shanghai 200240, China.

Publication information

J Genet Genomics. 2015 Aug 20;42(8):445-53. doi: 10.1016/j.jgg.2015.06.007. Epub 2015 Jul 9.

DOI: 10.1016/j.jgg.2015.06.007
PMID: 26336801

Abstract

Population stratification is a problem in genetic association studies because it is likely to highlight loci that underlie population structure rather than disease-related loci. Principal component analysis (PCA) has proven to be an effective way to correct for population stratification. However, the conventional PCA algorithm is time-consuming on large datasets. We developed a graphics processing unit (GPU)-based PCA software named SHEsisPCA (http://analysis.bio-x.cn/SHEsisMain.htm) that is highly parallel, with a maximum speedup of more than 100-fold over its CPU version. A clustering algorithm based on X-means was also implemented to detect population subgroups and to obtain matched cases and controls, in order to reduce genomic inflation and increase statistical power. Tests on both simulated and real datasets showed that SHEsisPCA ran at very high speed with almost no loss of accuracy. SHEsisPCA can therefore help correct for population stratification much more efficiently than conventional CPU-based algorithms.
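To make the PCA step concrete, here is a minimal CPU-only sketch of how a leading principal component can separate two subpopulations in genotype data. It is not the SHEsisPCA implementation: the function names (`simulate_two_populations`, `standardize`, `top_pc`), the simulated allele frequencies, and the use of power iteration are all assumptions chosen for illustration, and the abstract does not say which eigen-solver the software uses.

```python
import random

def simulate_two_populations(n_per_pop=20, n_snps=200, seed=1):
    """Hypothetical toy data: 0/1/2 allele counts for two populations
    whose allele frequencies differ by 0.4 at every SNP."""
    rng = random.Random(seed)
    freqs_a = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]
    freqs_b = [p + 0.4 for p in freqs_a]
    rows = []
    for freqs in (freqs_a, freqs_b):
        for _ in range(n_per_pop):
            # Each genotype is the sum of two independent allele draws.
            rows.append([(rng.random() < p) + (rng.random() < p) for p in freqs])
    return rows

def standardize(genotypes):
    """Center each SNP and scale it by sqrt(2p(1-p)); returns SNP columns."""
    n = len(genotypes)
    cols = []
    for j in range(len(genotypes[0])):
        col = [genotypes[i][j] for i in range(n)]
        mean = sum(col) / n
        p = mean / 2.0                      # pooled allele-frequency estimate
        sd = (2.0 * p * (1.0 - p)) ** 0.5
        if sd == 0.0:
            continue                        # skip monomorphic SNPs
        cols.append([(g - mean) / sd for g in col])
    return cols

def top_pc(genotypes, iters=200, seed=0):
    """Leading eigenvector of X X^T (samples x samples) by power iteration."""
    cols = standardize(genotypes)
    n = len(genotypes)
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        w = [0.0] * n
        for col in cols:
            # Accumulate (X X^T) v one SNP at a time, without forming X X^T.
            dot = sum(c * x for c, x in zip(col, v))
            for i in range(n):
                w[i] += col[i] * dot
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

pc1 = top_pc(simulate_two_populations())
print("population A scores:", [round(x, 2) for x in pc1[:5]])
print("population B scores:", [round(x, 2) for x in pc1[-5:]])
```

In this toy setting the two groups land on opposite sides of the first component, which is the signal used either as a covariate or as input to clustering. The per-SNP contributions in the inner loop are independent of one another, which is the kind of work that generally maps well onto parallel hardware.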

Similar articles

1. SHEsisPCA: a GPU-based software to correct for population stratification that efficiently accelerates the process for handling genome-wide datasets.
J Genet Genomics. 2015 Aug 20;42(8):445-53. doi: 10.1016/j.jgg.2015.06.007. Epub 2015 Jul 9.
2. Study of large and highly stratified population datasets by combining iterative pruning principal component analysis and structure.
BMC Bioinformatics. 2011 Jun 23;12:255. doi: 10.1186/1471-2105-12-255.
3. Evaluation of methods for adjusting population stratification in genome-wide association studies: Standard versus categorical principal component analysis.
Ann Hum Genet. 2019 Nov;83(6):454-464. doi: 10.1111/ahg.12339. Epub 2019 Jul 19.
4. Establishment of a standardized system to perform population structure analyses with limited sample size or with different sets of SNP genotypes.
J Hum Genet. 2010 Aug;55(8):525-33. doi: 10.1038/jhg.2010.63. Epub 2010 Jun 17.
5. GRAF-pop: A Fast Distance-Based Method To Infer Subject Ancestry from Multiple Genotype Datasets Without Principal Components Analysis.
G3 (Bethesda). 2019 Aug 8;9(8):2447-2461. doi: 10.1534/g3.118.200925.
6. Inference of population structure using genetic markers and a Bayesian model averaging approach for clustering.
J Comput Biol. 2008 Mar;15(2):207-20. doi: 10.1089/cmb.2007.0051.
7. Parallel GPU implementation of iterative PCA algorithms.
J Comput Biol. 2009 Nov;16(11):1593-9. doi: 10.1089/cmb.2008.0221.
8. WebStruct and VisualStruct: Web interfaces and visualization for Structure software implemented in a cluster environment.
J Integr Bioinform. 2008 Sep 24;5(1):89. doi: 10.2390/biecoll-jib-2008-89.
9. A real-time spike sorting method based on the embedded GPU.
Annu Int Conf IEEE Eng Med Biol Soc. 2017 Jul;2017:1010-1013. doi: 10.1109/EMBC.2017.8036997.
10. Fast box-counting algorithm on GPU.
Comput Methods Programs Biomed. 2012 Dec;108(3):1229-42. doi: 10.1016/j.cmpb.2012.07.005. Epub 2012 Aug 20.