mixIndependR: a R package for statistical independence testing of loci in database of multi-locus genotypes.

Affiliation

Department of Microbiology, Immunology and Genetics, University of North Texas Health Science Center, 3500 Camp Bowie Blvd, Fort Worth, TX, 76107, USA.

Publication information

BMC Bioinformatics. 2021 Jan 6;22(1):12. doi: 10.1186/s12859-020-03945-0.

DOI:10.1186/s12859-020-03945-0
PMID:33407074
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7788837/
Abstract

BACKGROUND

Multi-locus genotype data are widely used in population genetics and disease studies. When the utility of multi-locus data is evaluated, the independence of markers is a common consideration in genomic assessments. Non-random associations are generally tested pairwise through linkage disequilibrium; however, dependence within a panel may involve triplets, quartets, or larger sets of loci. Compatible and user-friendly software is therefore needed for testing and assessing global linkage disequilibrium in mixed genetic data.

RESULTS

This study describes a software package for testing the mutual independence of mixed genetic datasets. Mutual independence is defined as the absence of non-random associations among all subsets of the tested panel. The new R package "mixIndependR" calculates basic genetic parameters from population data, including allele frequency, genotype frequency, and heterozygosity, and tests Hardy-Weinberg equilibrium and linkage disequilibrium (LD) through mutual independence. It does so regardless of marker type: single nucleotide polymorphisms, short tandem repeats, insertions and deletions, or any other genetic markers. A novel method of assessing the dependence of mixed genetic panels is developed in this study and implemented in the software package. Overall independence is tested by comparing the observed distributions of two common summary statistics (the number of heterozygous loci [K] and the number of shared alleles [X]) with their expected distributions under the assumption of mutual independence.
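The comparison of observed and expected distributions of K can be illustrated with a small Python sketch. It is not the mixIndependR API, and every function name in it is hypothetical. It assumes that each locus is in Hardy-Weinberg proportions, so that the expected heterozygosity at a locus is one minus the sum of squared allele frequencies, and that under mutual independence K is a sum of independent per-locus indicators. A Pearson chi-square statistic is used here as one possible goodness-of-fit measure; the abstract does not say which test the package applies.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Allele frequencies at one locus from a list of (allele, allele) pairs."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


def expected_heterozygosity(freqs):
    """Expected heterozygosity, assuming Hardy-Weinberg proportions."""
    return 1.0 - sum(p * p for p in freqs.values())


def expected_k_distribution(het_probs):
    """P(K = k) for k = 0..n_loci, assuming loci are mutually independent."""
    dist = [1.0]
    for h in het_probs:
        nxt = [0.0] * (len(dist) + 1)
        for k, p in enumerate(dist):
            nxt[k] += p * (1.0 - h)   # this locus is homozygous
            nxt[k + 1] += p * h       # this locus is heterozygous
        dist = nxt
    return dist


def observed_k_counts(individuals):
    """Number of individuals with k heterozygous loci, for each k."""
    n_loci = len(individuals[0])
    counts = [0] * (n_loci + 1)
    for genotypes in individuals:
        k = sum(1 for a, b in genotypes if a != b)
        counts[k] += 1
    return counts


def chi_square_statistic(observed_counts, expected_probs):
    """Pearson chi-square statistic (illustrative choice of test)."""
    n = sum(observed_counts)
    return sum((o - n * e) ** 2 / (n * e)
               for o, e in zip(observed_counts, expected_probs) if e > 0)


# Toy mixed panel: locus 1 is SNP-like, locus 2 is STR-like (repeat counts).
individuals = [
    [("A", "A"), (10, 12)],
    [("A", "G"), (10, 10)],
    [("G", "G"), (12, 12)],
    [("A", "G"), (10, 12)],
]
n_loci = len(individuals[0])
het_probs = [
    expected_heterozygosity(allele_frequencies([ind[j] for ind in individuals]))
    for j in range(n_loci)
]
expected = expected_k_distribution(het_probs)
observed = observed_k_counts(individuals)
statistic = chi_square_statistic(observed, expected)
print(expected, observed, statistic)  # [0.25, 0.5, 0.25] [1, 2, 1] 0.0
```

Because only "same allele or not" is checked at each locus, the same code handles letters and repeat counts alike, which mirrors the marker-agnostic idea described above. In this toy dataset the observed counts match the expectation exactly, so the statistic is zero; real data would call for a proper significance threshold.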

CONCLUSION

The package "mixIndependR" is compatible with all categories of genetic markers and detects overall non-random associations. Compared with pairwise disequilibrium testing, the approach described here tends to have higher power, especially when the number of markers is large. With this package, more multi-functional or more powerful genetic panels can be developed, such as mixed panels that combine different kinds of markers. In population genetics, "mixIndependR" offers a more powerful method of detecting LD and so makes it possible to learn more about population admixture, natural selection, genetic drift, and population demographics. Moreover, this new approach can optimize variant selection in disease studies and contribute to panel combination for treatments in multimorbidity. Application of this approach to real data is expected in the future and may substantially advance genetic technology.

AVAILABILITY

The R package mixIndependR is available on the Comprehensive R Archive Network (CRAN) at https://cran.r-project.org/web/packages/mixIndependR/index.html

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/b1264de8d689/12859_2020_3945_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/c99e373fe8f3/12859_2020_3945_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/b661c72a93cb/12859_2020_3945_Fig3_HTML.jpg
Figure 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/1280f12983ac/12859_2020_3945_Fig4_HTML.jpg
Figure 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/2941aaefc0d7/12859_2020_3945_Fig5_HTML.jpg
Figure 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/13e5/7788837/ba5779614433/12859_2020_3945_Fig6_HTML.jpg
