A likelihood-based approach to mixed modeling with ambiguity in cluster identifiers.

Authors

Foulkes Andrea S, Yucel Recai, Li Xiaohong

Affiliation

Division of Biostatistics, School of Public Health and Health Sciences, University of Massachusetts, Amherst, MA, USA.

Publication information

Biostatistics. 2008 Oct;9(4):635-57. doi: 10.1093/biostatistics/kxm055. Epub 2008 Mar 14.

DOI:10.1093/biostatistics/kxm055

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2536727/

Abstract

This manuscript describes a novel, linear mixed-effects model-fitting technique for the setting in which correlated data indicators are not completely observed. Mixed modeling is a useful analytical tool for characterizing genotype-phenotype associations among multiple potentially informative genetic loci. This approach involves grouping individuals into genetic clusters, where individuals in the same cluster have similar or identical multilocus genotypes. In haplotype-based investigations of unrelated individuals, corresponding cluster assignments are unobservable since the alignment of alleles within chromosomal copies is not generally observed. We derive an expectation conditional maximization approach to estimation in the mixed modeling setting, where cluster assignments are ambiguous. The approach has broad relevance to the analysis of data with missing correlated data identifiers. An example is provided based on data arising from a cohort of human immunodeficiency virus type-1-infected individuals at risk for antiretroviral therapy-associated dyslipidemia.
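To make the source of the ambiguity concrete, the following is a minimal illustrative sketch, not the authors' estimation procedure. It assumes biallelic loci with unphased genotypes coded as variant-allele counts (0, 1, or 2) and enumerates every haplotype pair consistent with an observed genotype. The function name `compatible_diplotypes` and the coding scheme are hypothetical choices made for this illustration.

```python
from itertools import product


def compatible_diplotypes(genotype):
    """List the unordered haplotype pairs consistent with an unphased genotype.

    `genotype` is a tuple of variant-allele counts (0, 1, or 2), one per
    biallelic locus. This coding is an assumption made for illustration.
    """
    per_locus = []
    for g in genotype:
        if g == 0:
            per_locus.append([(0, 0)])          # homozygous reference
        elif g == 2:
            per_locus.append([(1, 1)])          # homozygous variant
        elif g == 1:
            per_locus.append([(0, 1), (1, 0)])  # heterozygous: phase unknown
        else:
            raise ValueError(f"invalid allele count: {g}")

    pairs = set()
    for combo in product(*per_locus):
        h1 = tuple(a for a, _ in combo)  # alleles on the first chromosomal copy
        h2 = tuple(b for _, b in combo)  # alleles on the second chromosomal copy
        pairs.add(tuple(sorted((h1, h2))))  # the order of the two copies is irrelevant
    return sorted(pairs)


if __name__ == "__main__":
    # Heterozygous at both loci: two haplotype pairs fit the same genotype.
    for pair in compatible_diplotypes((1, 1)):
        print(pair)
```

An individual who is heterozygous at two or more loci is compatible with more than one haplotype pair, so any cluster defined by haplotypes cannot be read directly from the genotype. A likelihood-based fit of the kind the abstract describes has to account for each compatible assignment rather than fix a single one.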
