An automated multi-modal graph-based pipeline for mouse genetic discovery.

Author information

Department of Anesthesia, Pain and Perioperative Medicine, Stanford University School of Medicine, Stanford, CA 94305, USA.

Publication information

Bioinformatics. 2022 Jun 27;38(13):3385-3394. doi: 10.1093/bioinformatics/btac356.

Abstract

MOTIVATION

Identifying the causative genetic factors in mouse genetic models of human diseases and biomedical traits remains difficult, because the true causative factors are often obscured by the many false positive genetic associations produced by a genome-wide association study (GWAS).

RESULTS

To accelerate the pace of genetic discovery, we developed a graph neural network (GNN)-based automated pipeline (GNNHap) that can rapidly analyze mouse genetic model data and identify high-probability causal genetic factors for the analyzed traits. After assessing the strength of allelic associations with the strain response pattern, the pipeline analyzes 29 million published papers to assess candidate gene-phenotype relationships, and incorporates information obtained from a protein-protein interaction network and from protein sequence features into the analysis. The GNN model produces markedly improved results relative to those of a simple linear neural network. We demonstrate that GNNHap can identify novel causative genetic factors for murine models of diabetes/obesity and for cataract formation, which were validated by the phenotypes appearing in previously analyzed gene knockout mice. The diabetes/obesity results indicate how characterization of the underlying genetic architecture enables new therapies to be discovered and tested by applying 'precision medicine' principles to murine models.
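
To illustrate why a graph-based model can differ from a simple linear network, the following is a minimal sketch of a single graph-convolution step over a toy protein-protein interaction graph. It is not the GNNHap implementation: the abstract does not specify the architecture, so the symmetric-normalized aggregation used here is an assumed, generic choice, and the graph, feature sizes, and function names (`normalized_adjacency`, `gcn_layer`, `linear_layer`) are all hypothetical.

```python
import numpy as np

def normalized_adjacency(edges, num_nodes):
    """Build D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    adj = np.eye(num_nodes)  # self-loops let each node keep its own features
    for i, j in edges:
        adj[i, j] = 1.0
        adj[j, i] = 1.0
    deg = adj.sum(axis=1)
    inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * inv_sqrt[:, None] * inv_sqrt[None, :]

def gcn_layer(a_hat, features, weights):
    """One graph-convolution step: aggregate neighbours, project, apply ReLU."""
    return np.maximum(a_hat @ features @ weights, 0.0)

def linear_layer(features, weights):
    """Baseline layer that ignores the graph structure entirely."""
    return np.maximum(features @ weights, 0.0)

# Hypothetical toy graph: 4 proteins, interactions 0-1 and 1-2, protein 3 isolated.
edges = [(0, 1), (1, 2)]
num_nodes = 4

rng = np.random.default_rng(0)
features = rng.normal(size=(num_nodes, 3))  # stand-in for per-protein features
weights = rng.normal(size=(3, 2))           # shared projection for both layers

a_hat = normalized_adjacency(edges, num_nodes)
gnn_out = gcn_layer(a_hat, features, weights)
lin_out = linear_layer(features, weights)
```

In this sketch the isolated protein gets the same output from both layers, while connected proteins receive a mix of their neighbours' features. That neighbourhood mixing is the extra information a graph model has access to and a purely linear model does not.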

AVAILABILITY AND IMPLEMENTATION

The GNNHap source code is freely available at https://github.com/zqfang/gnnhap, and the new version of the HBCGM program is available at https://github.com/zqfang/haplomap.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
