Department of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK.
Department of Energy and Power Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China.
Sci Rep. 2021 Jul 27;11(1):15269. doi: 10.1038/s41598-021-94742-z.
Autism is a spectrum disorder with wide variation in the type and severity of symptoms. Understanding gene-phenotype associations is vital to unraveling the disease mechanisms and advancing diagnosis and treatment. To date, several databases have stored a large portion of gene-phenotype associations, obtained mainly from genetic experiments. However, a large proportion of gene-phenotype associations remain buried in the autism-related literature, and few resources exist for investigating them. Given the abundance of this literature, we developed Autism_genepheno, a text mining pipeline that uses natural language processing methods to identify sentence-level mentions of autism-associated genes and phenotypes in the literature. We have generated a comprehensive database of gene-phenotype associations from the autism-related literature of the last five years, which can be easily updated as new literature becomes available. We evaluated the pipeline through several different approaches, and we are able to rank and select top autism-associated genes by their unique and wide spectrum of phenotypic profiles, which could provide a unique resource for the diagnosis and treatment of autism. The data resources and the Autism_genepheno pipeline are available at: https://github.com/maiziezhoulab/Autism_genepheno
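As a rough illustration of what "sentence-level mentions" of genes and phenotypes can mean, the following minimal sketch counts gene-phenotype pairs that occur in the same sentence. It is not the Autism_genepheno implementation, and the abstract does not describe the pipeline's internals. The lexicons (`GENES`, `PHENOTYPES`), the placeholder gene names, the function names, the regex-based sentence splitter, and the example text are all hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy lexicons; the real pipeline's vocabularies are not given in the abstract.
GENES = {"GENEA", "GENEB"}
PHENOTYPES = {"seizures", "macrocephaly", "intellectual disability"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_terms(sentence, lexicon, ignore_case):
    """Return the lexicon terms that occur as whole words in the sentence."""
    flags = re.IGNORECASE if ignore_case else 0
    return {
        term
        for term in lexicon
        if re.search(r"\b" + re.escape(term) + r"\b", sentence, flags)
    }


def cooccurrences(text):
    """Count (gene, phenotype) pairs mentioned together in the same sentence."""
    counts = Counter()
    for sentence in split_sentences(text):
        # Gene symbols are matched case-sensitively, phenotype terms case-insensitively.
        genes = find_terms(sentence, GENES, ignore_case=False)
        phenos = find_terms(sentence, PHENOTYPES, ignore_case=True)
        for gene in genes:
            for pheno in phenos:
                counts[(gene, pheno)] += 1
    return counts


# Invented placeholder text, not taken from any paper.
example = (
    "Variants in GENEA were reported alongside seizures and macrocephaly. "
    "GENEB was discussed in a separate cohort. "
    "Seizures were also noted in carriers of GENEA variants."
)
pairs = cooccurrences(example)
for (gene, pheno), n in sorted(pairs.items()):
    print(gene, pheno, n)
```

Counting pairs per sentence rather than per document is one simple way to keep each association traceable to its supporting text. A real system would likely need proper named-entity recognition and ontology-backed vocabularies in place of the toy dictionaries used here.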