An explainable dataset linking facial phenotypes and genes to rare genetic diseases.

Authors

Song Jie, He Mengqiao, Ren Shumin, Shen Bairong

Affiliation

Department of Ophthalmology and Institutes for Systems Genetics, Frontiers Science Center for Disease-related Molecular Network, West China Hospital, Sichuan University, Chengdu, 610212, China.

Publication information

Sci Data. 2025 Apr 15;12(1):634. doi: 10.1038/s41597-025-04922-z.

Abstract

Distinctive facial phenotypes serve as crucial diagnostic markers for many rare genetic diseases. Although AI-driven image recognition achieves high diagnostic accuracy, it often fails to explain its predictions. In this study, we present the Facial phenotype-Gene-Disease Dataset (FGDD), an explainable dataset collected from 509 research publications. It contains 1,147 data records encompassing 197 disease-causing genes, 437 facial phenotypes, and 211 disease entities, with 689 records having disease labels. Each data record represents a patient group and includes demographic information, variation information, and phenotype information. Baseline and explainability validations conducted on FGDD confirmed the dataset's effectiveness. FGDD supports the training of diagnostic models for rare genetic diseases while delivering explainable results, and provides a foundation for exploring intricate connections between genes, diseases, and facial phenotypes.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0426/12000290/eddd1bc11060/41597_2025_4922_Fig1_HTML.jpg
