Multiple imputation of missing genotype data for unrelated individuals.

Authors

Souverein O W, Zwinderman A H, Tanck M W T

Affiliation

Department of Clinical Epidemiology and Biostatistics, Academic Medical Center, Amsterdam, the Netherlands.

Publication information

Ann Hum Genet. 2006 May;70(Pt 3):372-81. doi: 10.1111/j.1529-8817.2005.00236.x.

DOI: 10.1111/j.1529-8817.2005.00236.x
PMID: 16674559
Abstract

The objective of this study was to investigate the performance of multiple imputation of missing genotype data for unrelated individuals using the polytomous logistic regression model, focusing on different missingness mechanisms, percentages of missing data, and imputation models. A complete dataset of 581 individuals, each analysed for eight biallelic polymorphisms and the quantitative phenotype HDL-C, was used. From this dataset one hundred replicates with missing data were created, in different ways for different scenarios. The performance was assessed by comparing the mean bias in parameter estimates, the root mean squared standard errors, and the genotype-imputation error rates. Overall, the mean bias was small in all scenarios, and in most scenarios the mean did not differ significantly from 'no bias'. Including polymorphisms that are highly correlated in the imputation model reduced the genotype-imputation error rate and increased precision of the parameter estimates. The method works well for data that are missing completely at random, and for data that are missing at random. In conclusion, our results indicate that multiple imputation with the polytomous logistic regression model can be used for association studies to deal with the problem of missing genotype data, when attention is paid to the imputation model and the percentage of missing data.
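
The workflow summarised in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names are hypothetical, the genotype probabilities are assumed to come from an already fitted polytomous logistic regression model, and the abstract does not say how estimates were combined across imputed datasets, so the standard Rubin's rules are assumed here.

```python
import random
import statistics


def draw_genotype(probs, rng):
    """Draw one genotype (0, 1 or 2 copies of an allele) for a missing value.

    probs: predicted probabilities for genotypes 0, 1 and 2, assumed to come
    from a fitted polytomous logistic regression model (not fitted here).
    """
    return rng.choices([0, 1, 2], weights=probs, k=1)[0]


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (assumed pooling method).

    estimates: one parameter estimate per imputed dataset.
    variances: the squared standard error of each estimate.
    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    pooled = statistics.mean(estimates)
    within = statistics.mean(variances)          # average within-imputation variance
    between = statistics.variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


def imputation_error_rate(true_genotypes, imputed_genotypes):
    """Fraction of imputed genotypes that differ from the known true genotypes."""
    mismatches = sum(t != i for t, i in zip(true_genotypes, imputed_genotypes))
    return mismatches / len(true_genotypes)


if __name__ == "__main__":
    rng = random.Random(1)
    # Hypothetical predicted probabilities for one individual's missing genotype.
    print(draw_genotype([0.6, 0.3, 0.1], rng))
    # Hypothetical estimates and squared standard errors from three imputed datasets.
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
    print(imputation_error_rate([0, 1, 2, 1], [0, 1, 1, 1]))
```

Drawing each missing genotype from its predicted distribution, rather than always assigning the most probable genotype, is what lets the between-imputation variance reflect the uncertainty about the missing values.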


Similar articles

1
Multiple imputation of missing genotype data for unrelated individuals.
Ann Hum Genet. 2006 May;70(Pt 3):372-81. doi: 10.1111/j.1529-8817.2005.00236.x.
2
Dealing with missing data in a multi-question depression scale: a comparison of imputation methods.
BMC Med Res Methodol. 2006 Dec 13;6:57. doi: 10.1186/1471-2288-6-57.
3
The influence of missing value imputation on detection of differentially expressed genes from microarray data.
Bioinformatics. 2005 Dec 1;21(23):4272-9. doi: 10.1093/bioinformatics/bti708. Epub 2005 Oct 10.
4
Combined linkage and association mapping of quantitative trait loci with missing completely at random genotype data.
Behav Genet. 2008 May;38(3):316-36. doi: 10.1007/s10519-008-9194-3. Epub 2008 Feb 27.
5
[Multiple imputation of missing at random data: General points and presentation of a Monte-Carlo method].
Rev Epidemiol Sante Publique. 2009 Oct;57(5):361-72. doi: 10.1016/j.respe.2009.04.011. Epub 2009 Aug 11.
6
Missing data on the Center for Epidemiologic Studies Depression Scale: a comparison of 4 imputation techniques.
Res Social Adm Pharm. 2007 Mar;3(1):1-27. doi: 10.1016/j.sapharm.2006.04.001.
7
Imputation of missing values is superior to complete case analysis and the missing-indicator method in multivariable diagnostic research: a clinical example.
J Clin Epidemiol. 2006 Oct;59(10):1102-9. doi: 10.1016/j.jclinepi.2006.01.015. Epub 2006 Jul 11.
8
Evaluating the ability of tree-based methods and logistic regression for the detection of SNP-SNP interaction.
Ann Hum Genet. 2009 May;73(Pt 3):360-9. doi: 10.1111/j.1469-1809.2009.00511.x. Epub 2009 Mar 8.
9
Missing phenotype data imputation in pedigree data analysis.
Genet Epidemiol. 2008 Jan;32(1):52-60. doi: 10.1002/gepi.20261.
10
Propensity score estimation with missing values using a multiple imputation missingness pattern (MIMP) approach.
Stat Med. 2009 Apr 30;28(9):1402-14. doi: 10.1002/sim.3549.

Cited by

1
Comparison of Cholangiocarcinoma and Hepatocellular Carcinoma Incidence Trends from 1993 to 2012 in Lampang, Thailand.
Int J Environ Res Public Health. 2022 Aug 3;19(15):9551. doi: 10.3390/ijerph19159551.
2
Fast and Scalable Private Genotype Imputation Using Machine Learning and Partially Homomorphic Encryption.
IEEE Access. 2021;9:93097-93110. doi: 10.1109/access.2021.3093005. Epub 2021 Jun 28.
3
Practical data handling pipeline improves performance of qPCR-based circulating miRNA measurements.
RNA. 2017 May;23(5):811-821. doi: 10.1261/rna.059063.116. Epub 2017 Feb 15.
4
The use of a multiple imputation method to investigate the trends in histologic types of lung cancer in Songkhla province, Thailand, 1989-2013.
BMC Cancer. 2016 Jul 4;16:389. doi: 10.1186/s12885-016-2441-8.
5
On the performance of multiple imputation based on chained equations in tackling missing data of the African α3.7-globin deletion in a malaria association study.
Ann Hum Genet. 2014 Jul;78(4):277-89. doi: 10.1111/ahg.12065.
6
Joint modelling rationale for chained equations.
BMC Med Res Methodol. 2014 Feb 21;14:28. doi: 10.1186/1471-2288-14-28.
7
Probability genotype imputation method and integrated weighted lasso for QTL identification.
BMC Genet. 2013 Dec 30;14:125. doi: 10.1186/1471-2156-14-125.
8
A pharmacokinetic/pharmacodynamic model of tumor lysis syndrome in chronic lymphocytic leukemia patients treated with flavopiridol.
Clin Cancer Res. 2013 Mar 1;19(5):1269-80. doi: 10.1158/1078-0432.CCR-12-1092. Epub 2013 Jan 8.
9
Genotype determination for polymorphisms in linkage disequilibrium.
BMC Bioinformatics. 2009 Feb 20;10:63. doi: 10.1186/1471-2105-10-63.
10
An ensemble learning approach jointly modeling main and interaction effects in genetic association studies.
Genet Epidemiol. 2008 May;32(4):285-300. doi: 10.1002/gepi.20304.