
Integrative analysis of single nucleotide polymorphisms and gene expression efficiently distinguishes samples from closely related ethnic populations.

Affiliation

Institute of Statistical Science, Academia Sinica, Taipei 115, Taiwan.

Publication information

BMC Genomics. 2012 Jul 28;13:346. doi: 10.1186/1471-2164-13-346.

DOI: 10.1186/1471-2164-13-346
PMID: 22839760
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3453505/
Abstract

BACKGROUND

Ancestry informative markers (AIMs) are a type of genetic marker that is informative for tracing the ancestral ethnicity of individuals. Application of AIMs has gained substantial attention in population genetics, forensic sciences, and medical genetics. Single nucleotide polymorphisms (SNPs), the materials of AIMs, are useful for classifying individuals from distinct continental origins but cannot discriminate individuals with subtle genetic differences from closely related ancestral lineages. Proof-of-principle studies have shown that gene expression (GE) also is a heritable human variation that exhibits differential intensity distributions among ethnic groups. GE supplies ethnic information supplemental to SNPs; this motivated us to integrate SNP and GE markers to construct AIM panels with a reduced number of required markers and provide high accuracy in ancestry inference. Few studies in the literature have considered GE in this aspect, and none have integrated SNP and GE markers to aid classification of samples from closely related ethnic populations.

RESULTS

We integrated a forward variable selection procedure into flexible discriminant analysis to identify key SNP and/or GE markers with the highest cross-validation prediction accuracy. By analyzing genome-wide SNP and/or GE markers in 210 independent samples from four ethnic groups in the HapMap II Project, we found that average testing accuracies for a majority of classification analyses were quite high, except for SNP-only analyses that were performed to discern study samples containing individuals from two close Asian populations. The average testing accuracies ranged from 0.53 to 0.79 for SNP-only analyses and increased to around 0.90 when GE markers were integrated together with SNP markers for the classification of samples from closely related Asian populations. Compared to GE-only analyses, integrative analyses of SNP and GE markers showed comparable testing accuracies and a reduced number of selected markers in AIM panels.
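The selection idea described above can be illustrated with a minimal sketch: a greedy forward search that adds, one marker at a time, whichever candidate most improves cross-validated accuracy. This is not the BIASLESS implementation. A simple nearest-centroid classifier stands in for flexible discriminant analysis, and the function names, the toy data, and the stopping rule (stop when accuracy no longer improves) are assumptions made for illustration only.

```python
import random
from statistics import mean


def nearest_centroid_predict(train_X, train_y, test_X, features):
    """Assign each test row to the class whose centroid (over `features`) is closest.

    Stand-in classifier: the paper uses flexible discriminant analysis instead.
    """
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_X, train_y) if y == label]
        centroids[label] = [mean(r[f] for r in rows) for f in features]
    preds = []
    for x in test_X:
        dist = {
            label: sum((x[f] - c[i]) ** 2 for i, f in enumerate(features))
            for label, c in centroids.items()
        }
        preds.append(min(sorted(dist), key=dist.get))  # sorted() makes ties deterministic
    return preds


def cv_accuracy(X, y, features, k=5, seed=0):
    """k-fold cross-validated accuracy using only the given feature columns."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    correct = 0
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        preds = nearest_centroid_predict(
            [X[i] for i in train], [y[i] for i in train],
            [X[i] for i in fold], features,
        )
        correct += sum(p == y[i] for p, i in zip(preds, fold))
    return correct / len(X)


def forward_select(X, y, max_features=3, k=5):
    """Greedy forward selection: keep adding the marker that most improves CV accuracy."""
    selected, best_acc = [], 0.0
    candidates = set(range(len(X[0])))
    while candidates and len(selected) < max_features:
        scores = {f: cv_accuracy(X, y, selected + [f], k) for f in sorted(candidates)}
        f_best = max(sorted(scores), key=scores.get)
        if scores[f_best] <= best_acc:  # assumed stopping rule: no further improvement
            break
        selected.append(f_best)
        best_acc = scores[f_best]
        candidates.remove(f_best)
    return selected, best_acc


def make_toy_data(n_per_class=40, seed=1):
    """Hypothetical data: columns 0-1 mimic uninformative genotype codes (0/1/2),
    column 2 mimics an informative expression level, column 3 is noise."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([
                rng.choice([0, 1, 2]),
                rng.choice([0, 1, 2]),
                rng.gauss(6.0 * label, 1.0),
                rng.gauss(0.0, 1.0),
            ])
            y.append(label)
    return X, y


X, y = make_toy_data()
selected, acc = forward_select(X, y)
print("selected marker columns:", selected, "CV accuracy:", round(acc, 3))
```

In this toy setting the expression-like column separates the two groups while the genotype-like columns do not, which loosely mirrors the reported pattern for closely related populations. The real analysis works on genome-wide marker sets and reports testing accuracy on held-out samples.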

CONCLUSIONS

Integrative analysis of SNP and GE markers provides high-accuracy and/or cost-effective classification results for assigning samples from closely related or distantly related ancestral lineages to their original ancestral populations. User-friendly BIASLESS (Biomarkers Identification and Samples Subdivision) software was developed as an efficient tool for selecting key SNP and/or GE markers and then building models for sample subdivision. BIASLESS was programmed in R and R-GUI and is available online at http://www.stat.sinica.edu.tw/hsinchou/genetics/prediction/BIASLESS.htm.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3616/3453505/ed93166ee093/1471-2164-13-346-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3616/3453505/59ba2955f072/1471-2164-13-346-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3616/3453505/78592e5a12eb/1471-2164-13-346-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3616/3453505/9fdf2b9fa5f7/1471-2164-13-346-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3616/3453505/50337ad9597b/1471-2164-13-346-5.jpg


