Dinucleotide composition representation-based deep learning to predict scoliosis-associated Fibrillin-1 genotypes.

Authors

Zhang Sen, Dai Li-Na, Yin Qi, Kang Xiao-Ping, Zeng Dan-Dan, Jiang Tao, Zhao Guang-Yu, Li Xiao-He, Li Jing

Affiliations

State Key Laboratory of Pathogen and Biosecurity, Academy of Military Medical Sciences, Beijing, China.

College of Basic Medical Sciences, Inner Mongolia Medical University, Hohhot, China.

Publication information

Front Genet. 2024 Oct 22;15:1492226. doi: 10.3389/fgene.2024.1492226. eCollection 2024.

Abstract

INTRODUCTION

Scoliosis is a pathological deformation of the spine that is predominantly classified as "idiopathic" because its etiology is unknown. However, it has been suggested that scoliosis may have a polygenic background. Identifying genetic backgrounds potentially related to Adolescent Idiopathic Scoliosis (AIS) before scoliosis onset is therefore crucial.

METHODS

The present study was designed to automatically parse, decompose and predict AIS-related variants in the ClinVar database. Possibly AIS-related variant records downloaded from ClinVar were parsed for various labels, decomposed into Dinucleotide Compositional Representation (DCR) and other traits, screened for high-risk genes by statistical analysis, and then used to train deep learning models that predict high-risk AIS genotypes.
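The abstract does not spell out how DCR is computed, so the following is only a minimal sketch of a generic dinucleotide composition vector: the normalized frequencies of the 16 overlapping dinucleotides in a DNA sequence. The function name `dinucleotide_composition`, the example sequences, and the choice to skip ambiguous bases are all assumptions made for illustration, not the authors' implementation.

```python
from itertools import product

BASES = "ACGT"
# The 16 dinucleotides in a fixed order (AA, AC, AG, AT, CA, ...).
DINUCLEOTIDES = ["".join(p) for p in product(BASES, repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 overlapping-dinucleotide frequencies of a DNA sequence.

    Hypothetical helper: the paper's exact DCR definition may differ.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:  # skip pairs containing N or other ambiguous codes
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


# Made-up toy sequences: a reference and a single-nucleotide variant of it.
reference = "ACGTACGT"
variant = "ACGTATGT"

ref_vec = dinucleotide_composition(reference)
var_vec = dinucleotide_composition(variant)

# Per-dinucleotide shift caused by the variant, one possible input feature.
delta = [v - r for r, v in zip(ref_vec, var_vec)]
```

A fixed-length vector of this kind is what makes sequences of different lengths comparable, which is one plausible reason a compositional encoding suits clustering and CNN training.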

RESULTS

The framework comprises all technical stages: data parsing, scoliosis genotyping, genome encoding, machine learning (ML)/deep learning (DL) and scoliosis genotype prediction. A total of 58,000 scoliosis-related records were automatically parsed and statistically analyzed for high-risk genes and genotypes, such as [gene names missing in source]. All variant genes were decomposed into DCR and other traits. Unsupervised ML indicated marked inter-group separation and intra-group clustering of the DCR of [gene names missing in source] for the five types of variants (Pathogenic, Likely pathogenic, Benign, Likely benign and Uncertain). An FBN1 DCR-based Convolutional Neural Network (CNN) trained on Pathogenic and Benign/Likely benign variants performed accurately on validation data and predicted 179 high-risk scoliosis variants. The trained predictor was interpretable in that variant types and variant locations were similarly distributed within 2D structure units in the predicted 3D structure of [name missing in source].

DISCUSSION

In summary, scoliosis risk can be predicted by deep learning on DCR features decomposed from the genome. The DCR-based classifier predicted additional scoliosis-risk variants in the ClinVar database. DCR-based models are promising for genotype-to-phenotype prediction in more disease types.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f8c/11534654/282029045088/fgene-15-1492226-g001.jpg
