[Establishment and validation of prediction models for human leukocyte antigen haplotypes and human leukocyte antigen genotypes].

Author information

Li Y, Du D, Zhang T T, Han Y, Song Y, Yuan X N, Bao X J, He J

Affiliations

Department of HLA Laboratory, Jiangsu Institute of Hematology, the First Affiliated Hospital of Soochow University, Suzhou 215031, China.

China Marrow Donor Program, Beijing 100010, China.

Publication information

Zhonghua Yi Xue Za Zhi. 2024 Mar 19;104(11):834-842. doi: 10.3760/cma.j.cn112137-20231130-01246.

Abstract

This study aimed to establish prediction models for human leukocyte antigen (HLA) haplotypes and HLA genotypes and to verify their prediction accuracy. The models were built on the characteristics of HLA haplotype inheritance and linkage disequilibrium (LD), together with the invention patents and software copyrights already obtained. They comprise an algorithm and reference databases, namely an HLA A-C-B-DRB1-DQB1 high-resolution haplotype database, a B-C and DRB1-DQB1 LD database, a G group allele table, and an NMDP Code allele table. The prediction algorithm involves data processing, comparison with reference data, filtering of results, probability calculation and ranking, confidence degree estimation, and output of the prediction results. Accuracy was verified by comparing the predictions with the correct results, and the relationship between prediction accuracy and the probability distribution and confidence degree of the predicted results was analyzed.

The HLA haplotype and genotype prediction models were established. The algorithm covers three predictions: A-C-B-DRB1-DQB1 haplotypes from HLA-A, B, DRB1, C, and DQB1 genotypes; C and DQB1 high-resolution results from A, B, and DRB1 high-resolution results; and A, B, DRB1, C, and DQB1 high-resolution results from A, B, and DRB1 intermediate- or low-resolution results.

Validation results of the "Predicting A-C-B-DRB1-DQB1 haplotypes based on HLA-A, B, DRB1, C, DQB1 genotypes" model: for a data set of 787 cases, the accuracy was 94.0% (740/787), with 740 correct predictions, 34 incorrect predictions, and 13 cases with no predicted result. For a data set of 847 cases, the accuracy was 100% (847/847). The 2 411 and 2 594 haplotype combinations predicted from the 787 and 847 cases were grouped by confidence degree: the accuracy was 100% (48/48 and 114/114, respectively) at a confidence degree of 1, and 96.2% (303/315) and 97.8% (409/418), respectively, at a confidence degree of 2.

Validation results of the "Predicting A, B, DRB1 and C, DQB1 high-resolution genotypes based on HLA-A, B, DRB1 high-, intermediate-, or low-resolution genotypes" model: when C and DQB1 high-resolution genotypes were predicted from A, B, and DRB1 high-resolution genotypes, 89.3% (1 459/1 634) of the predictions were correct. The accuracy within the top 2 results ranked by predicted probability (GPP) was 79.2% (1 156/1 459), and within the top 10 it was 95.0% (1 386/1 459). When GPP was ≥90% and 50%-90%, the prediction accuracy was 81.3% (209/257) and 72.8% (447/614), respectively. For A, B, and DRB1 high-resolution genotypes from the China Marrow Donor Program, the accuracy of predicting C and DQB1 high-resolution genotypes was 87.0% (20/23). The accuracy of predicting A, B, DRB1, C, and DQB1 high-resolution genotypes from A, B, and DRB1 intermediate- or low-resolution genotypes was 70.0% (7/10) and 52.5% (21/40), respectively. When predicting whether a patient is likely to have an HLA 10/10 matched donor, the accuracy of the top 2 GPP combinations with a proportion of ≥50% was 85.7% (6/7).

When A, B, DRB1, C, and DQB1 genotypes are used to predict A-C-B-DRB1-DQB1 haplotype combinations, results with a confidence degree of 1 or 2 are reliable. When C and DQB1 genotypes are predicted from A, B, and DRB1 genotypes, the top 10 results ranked by GPP are reliable, and the top 2 results with GPP≥50% are more reliable.
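
The abstract does not disclose the algorithm itself, so the following is only a minimal sketch of one common way to carry out the "comparison with reference data, filtering, probability calculation and ranking" steps for haplotype prediction. It assumes Hardy-Weinberg proportions, so a haplotype pair is weighted by the product of its two frequencies (doubled when the haplotypes differ). The frequency table, the example genotype, and the names `HAPLOTYPE_FREQ`, `GENOTYPE`, and `rank_haplotype_pairs` are all hypothetical illustrations, not the authors' databases or code. G group and NMDP Code expansion, the LD databases, and the confidence degree are omitted.

```python
# Loci in the order used by the reference haplotype database in the abstract.
LOCI = ("A", "C", "B", "DRB1", "DQB1")

# Toy haplotypes with INVENTED frequencies, for illustration only.
H1 = ("A*02:01", "C*01:02", "B*46:01", "DRB1*09:01", "DQB1*03:03")
H2 = ("A*11:01", "C*07:02", "B*40:01", "DRB1*08:03", "DQB1*06:01")
H3 = ("A*11:01", "C*01:02", "B*46:01", "DRB1*09:01", "DQB1*03:03")
H4 = ("A*02:01", "C*07:02", "B*40:01", "DRB1*08:03", "DQB1*06:01")
H5 = ("A*24:02", "C*03:04", "B*13:01", "DRB1*12:02", "DQB1*03:01")
HAPLOTYPE_FREQ = {H1: 0.04, H2: 0.01, H3: 0.02, H4: 0.005, H5: 0.03}

# Hypothetical unphased five-locus genotype (two alleles per locus).
GENOTYPE = {
    "A": ("A*02:01", "A*11:01"),
    "C": ("C*01:02", "C*07:02"),
    "B": ("B*46:01", "B*40:01"),
    "DRB1": ("DRB1*09:01", "DRB1*08:03"),
    "DQB1": ("DQB1*03:03", "DQB1*06:01"),
}


def rank_haplotype_pairs(genotype, freqs):
    """Return [((hap1, hap2), probability), ...] sorted by descending probability.

    A pair is kept only if, at every locus, its two alleles reproduce the
    genotype exactly. Probabilities are normalized over the kept pairs.
    """
    target = {locus: tuple(sorted(genotype[locus])) for locus in LOCI}
    haplotypes = sorted(freqs)
    candidates = []
    for i, hap1 in enumerate(haplotypes):
        for hap2 in haplotypes[i:]:  # unordered pairs, including hap1 == hap2
            consistent = all(
                tuple(sorted((hap1[k], hap2[k]))) == target[locus]
                for k, locus in enumerate(LOCI)
            )
            if consistent:
                # Hardy-Weinberg weight: f1*f2, doubled for distinct haplotypes.
                weight = freqs[hap1] * freqs[hap2] * (1 if hap1 == hap2 else 2)
                candidates.append(((hap1, hap2), weight))
    total = sum(weight for _, weight in candidates)
    if total == 0:
        return []  # analogous to "no predicted result"
    ranked = [(pair, weight / total) for pair, weight in candidates]
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


if __name__ == "__main__":
    for (hap1, hap2), prob in rank_haplotype_pairs(GENOTYPE, HAPLOTYPE_FREQ):
        print(f"{prob:.3f}  {'-'.join(hap1)}  +  {'-'.join(hap2)}")
```

With these toy frequencies, two phasings explain the genotype, and the pair built from the two more frequent haplotypes ranks first. A genotype that no pair in the reference table can reproduce yields an empty list, which loosely mirrors the "no predicted result" cases reported above.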
