Visualization of Runs of Homozygosity and Classification Using Convolutional Neural Networks.

Authors

Bakoev Siroj, Kolosova Maria, Romanets Timofey, Bakoev Faridun, Kolosov Anatoly, Romanets Elena, Korobeinikova Anna, Bakoeva Ilona, Akhmedli Vagif, Getmantseva Lyubov

Affiliations

Faculty of Biotechnology, Don State Agrarian University, Persianovsky 346493, Russia.

Academy of Biology and Biotechnology Named After D. I. Ivanovsky, Southern Federal University, Rostov-on-Don 344006, Russia.

Publication information

Biology (Basel). 2025 Apr 16;14(4):426. doi: 10.3390/biology14040426.

Abstract

Runs of homozygosity (ROH) are key elements of the genetic structure of populations, reflecting inbreeding levels, selection history, and potential associations with phenotypic traits. This study proposes a novel approach to ROH analysis through visualization and classification using convolutional neural networks (CNNs). Genetic data from Large White (n = 568) and Duroc (n = 600) pigs were used to construct ROH maps, where each homozygous segment was classified by length and visualized as a color-coded image. The analysis was conducted in two stages: (1) classification of animals by breed based on ROH maps and (2) identification of the presence or absence of a phenotypic trait (limb defects). Genotyping was performed using the GeneSeek GGP SNP80x1_XT chip (Illumina Inc., San Diego, CA, USA), and ROH segments were identified using the software tool PLINK v1.9. To visualize individual maps, we utilized a modified function from the HandyCNV package. The results showed that the CNN model achieved 100% accuracy, sensitivity, and specificity in classifying pig breeds based on ROH maps. When analyzing the binary trait (presence or absence of limb defects), the model demonstrated an accuracy of 78.57%. Despite the moderate accuracy in predicting the phenotypic trait, the high negative predictive value (84.62%) indicates the model's reliability in identifying healthy animals. This method can be applied not only in animal breeding research but also in medicine to study the association between ROH and hereditary diseases. Future plans include expanding the method to other types of genetic data and developing mechanisms to improve the interpretability of deep learning models.
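The idea of turning ROH segments into a color-coded image, as described in the abstract, can be illustrated with a minimal Python sketch. This is not the authors' modified HandyCNV function (HandyCNV is an R package); the length thresholds, the colors, the image layout, and the names `length_class` and `roh_map` are all assumptions made for illustration, since the abstract does not state them.

```python
import numpy as np

# ASSUMPTION: length-class boundaries in Mb; the abstract does not give thresholds.
# Resulting classes: <2, 2-4, 4-8, 8-16, >=16 Mb.
LENGTH_BINS_MB = [2.0, 4.0, 8.0, 16.0]

# ASSUMPTION: one arbitrary RGB color per length class.
CLASS_COLORS = np.array(
    [
        [255, 237, 160],  # class 0: shortest segments
        [254, 178, 76],
        [253, 141, 60],
        [227, 26, 28],
        [128, 0, 38],     # class 4: longest segments
    ],
    dtype=np.uint8,
)


def length_class(length_mb):
    """Return the index of the length class for a segment of the given length."""
    return int(np.searchsorted(LENGTH_BINS_MB, length_mb, side="right"))


def roh_map(segments, chrom_lengths_mb, width=256, rows_per_chrom=4):
    """Rasterize ROH segments into an RGB image with one band per chromosome.

    segments: iterable of (chrom_index, start_mb, end_mb), chrom_index 0-based.
    chrom_lengths_mb: chromosome lengths in Mb; the longest spans the full width.
    """
    max_len = max(chrom_lengths_mb)
    height = len(chrom_lengths_mb) * rows_per_chrom
    img = np.full((height, width, 3), 255, dtype=np.uint8)  # white background
    for chrom, start, end in segments:
        x0 = int(start * width / max_len)
        # Draw at least one pixel per segment and stay inside the image.
        x1 = min(width, max(x0 + 1, int(np.ceil(end * width / max_len))))
        r0 = chrom * rows_per_chrom
        img[r0:r0 + rows_per_chrom, x0:x1] = CLASS_COLORS[length_class(end - start)]
    return img


# Hypothetical toy input: two chromosomes and two segments.
example = roh_map(
    [(0, 10.0, 11.5), (1, 0.0, 20.0)],
    chrom_lengths_mb=[100.0, 50.0],
    width=100,
    rows_per_chrom=2,
)
```

An array like `example` could then be saved as an image and passed to a CNN classifier. In practice the `(chrom, start, end)` triples would come from the ROH detection output, for example from PLINK, after converting base-pair positions to Mb.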

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b0f9/12025119/316996fab4d5/biology-14-00426-g001.jpg
