
Forward-time simulation of realistic samples for genome-wide association studies.

Affiliation

Department of Epidemiology, The University of Texas MD Anderson Cancer Center, Houston, Texas 77030, USA.

Publication information

BMC Bioinformatics. 2010 Sep 1;11:442. doi: 10.1186/1471-2105-11-442.

DOI:10.1186/1471-2105-11-442
PMID:20809983
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2939614/
Abstract

BACKGROUND

Forward-time simulations have unique advantages in power and flexibility for the simulation of genetic samples of complex human diseases because they can closely mimic the evolution of human populations carrying these diseases. However, a number of methodological and computational constraints have prevented the power of this simulation method from being fully explored in existing forward-time simulation methods.

RESULTS

Using a general-purpose forward-time population genetics simulation environment, we developed a forward-time simulation method that can be used to simulate realistic samples for genome-wide association studies. We examined the properties of this simulation method by comparing simulated samples with real data and demonstrated its wide applicability using four examples, including a simulation of case-control samples with a disease caused by multiple interacting genetic and environmental factors, a simulation of trio families affected by a disease-predisposing allele that had been subjected to either slow or rapid selective sweep, and a simulation of a structured population resulting from recent population admixture.

CONCLUSIONS

Our algorithm simulates populations that closely resemble the complex structure of the human genome, while allowing the introduction of signals of natural selection. Because it can flexibly generate different types of samples with arbitrary disease or quantitative trait models, this simulation method can simulate realistic samples to evaluate the performance of a wide variety of statistical gene mapping methods for genome-wide association studies.
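
As a rough illustration of the forward-time idea described in the abstract, the sketch below evolves a diploid population generation by generation under selection and then draws a case-control sample from the final generation. It is not the authors' implementation: it assumes a single biallelic locus, a fixed population size, random mating, and no mutation or recombination, and the function names `simulate_forward` and `draw_case_control`, along with the fitness and penetrance values, are hypothetical choices made only for this example.

```python
import random


def simulate_forward(pop_size, generations, init_freq, fitness, seed=0):
    """Evolve a diploid Wright-Fisher population forward in time.

    Assumptions (illustrative only): one biallelic locus, constant size,
    random mating, no mutation. `fitness` maps the number of risk alleles
    (0, 1, 2) to a positive relative fitness.
    Returns the final generation as a list of (allele, allele) tuples.
    """
    rng = random.Random(seed)
    # Founders: each allele is the risk allele (1) with probability init_freq.
    pop = [
        (int(rng.random() < init_freq), int(rng.random() < init_freq))
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: parents are chosen in proportion to their fitness.
        weights = [fitness[a + b] for a, b in pop]
        mothers = rng.choices(pop, weights=weights, k=pop_size)
        fathers = rng.choices(pop, weights=weights, k=pop_size)
        # Mendelian transmission: each parent passes on one random allele.
        pop = [(rng.choice(m), rng.choice(f)) for m, f in zip(mothers, fathers)]
    return pop


def draw_case_control(pop, penetrance, n_cases, n_controls, seed=1):
    """Assign affection status by genotype penetrance, then sample.

    `penetrance` maps the number of risk alleles to the probability of
    being affected. Returns up to n_cases cases and n_controls controls.
    """
    rng = random.Random(seed)
    cases, controls = [], []
    for ind in rng.sample(pop, len(pop)):  # visit individuals in random order
        affected = rng.random() < penetrance[sum(ind)]
        if affected and len(cases) < n_cases:
            cases.append(ind)
        elif not affected and len(controls) < n_controls:
            controls.append(ind)
    return cases, controls


if __name__ == "__main__":
    # Hypothetical parameters: a mildly deleterious risk allele.
    final_pop = simulate_forward(
        pop_size=1000, generations=50, init_freq=0.2,
        fitness={0: 1.0, 1: 0.98, 2: 0.96},
    )
    cases, controls = draw_case_control(
        final_pop, penetrance={0: 0.05, 1: 0.15, 2: 0.30},
        n_cases=100, n_controls=100,
    )
    freq = sum(a + b for a, b in final_pop) / (2 * len(final_pop))
    print(f"risk allele frequency: {freq:.3f}")
    print(f"cases: {len(cases)}, controls: {len(controls)}")
```

Because selection acts in every generation before the sample is drawn, the sampled genotypes carry the footprint of the population's history. A genome-wide simulation of the kind described above would additionally need many linked markers, recombination, and demographic change, none of which this single-locus sketch attempts.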

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82bc/2939614/0f12f09aa03f/1471-2105-11-442-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82bc/2939614/4cf5d22e4e43/1471-2105-11-442-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82bc/2939614/43c6ac5b74ca/1471-2105-11-442-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82bc/2939614/3887381b7801/1471-2105-11-442-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/82bc/2939614/ea9c1295f8df/1471-2105-11-442-5.jpg
