Department of Gastroenterology, Renmin Hospital of Wuhan University, Wuhan, Hubei, 430060, P. R. China.
Hubei Key Laboratory of Digestive System, Renmin Hospital of Wuhan University, Wuhan, Hubei, 430060, P. R. China.
Adv Sci (Weinh). 2024 Nov;11(44):e2405395. doi: 10.1002/advs.202405395. Epub 2024 Oct 7.
Decoding gene regulatory networks is essential for understanding the mechanisms underlying many complex diseases. This study presents GENET, an automated system designed to extract and visualize extensive molecular relationships from published biomedical literature. Using natural language processing, entities and relations are identified in a randomly selected set of 1788 scientific articles and visualized in a filterable knowledge graph. The performance of GENET is evaluated and compared with existing methods. The named entity recognition model achieves an overall precision of 94.23% (4835/5131; 93.56-94.84%), recall of 97.72% (4835/4948; 97.27-98.10%), and an F1 score of 95.94%. The relation extraction model achieves an overall precision of 91.63% (2593/2830; 90.55-92.59%), recall of 89.17% (2593/2908; 87.99-90.25%), and an F1 score of 90.38%. GENET significantly outperforms existing methods in extracting molecular relationships (P < 0.001). Additionally, GENET predicts that WNT family member 4 regulates insulin-like growth factor 2 via signal transducer and activator of transcription 3 in colon cancer. This prediction is validated with RNA sequencing data and multiple immunofluorescence, supporting the feasibility of GENET.
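The reported precision, recall, and F1 values can be reproduced from the counts given in parentheses. The sketch below is illustrative and is not taken from the GENET code. It assumes the usual reading of the fractions (correct extractions over all extractions for precision, correct extractions over all reference annotations for recall). The abstract does not say how the bracketed ranges were computed; a Wilson score interval with z = 1.96 is used here as an assumption, and it happens to reproduce the range quoted for named entity recognition precision. The function names `prf` and `wilson_interval` are hypothetical.

```python
import math


def prf(true_pos, n_predicted, n_reference):
    """Return (precision, recall, F1) from raw counts."""
    precision = true_pos / n_predicted
    recall = true_pos / n_reference
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion.

    Assumption: the abstract does not name its interval method;
    z = 1.96 corresponds to a two-sided 95% level.
    """
    p = successes / total
    denom = 1 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return center - half, center + half


# Counts copied from the abstract.
ner = prf(4835, 5131, 4948)   # named entity recognition
rel = prf(2593, 2830, 2908)   # relation extraction
ner_precision_ci = wilson_interval(4835, 5131)

if __name__ == "__main__":
    for name, (p, r, f1) in (("NER", ner), ("RE", rel)):
        print(f"{name}: P={p:.2%} R={r:.2%} F1={f1:.2%}")
    low, high = ner_precision_ci
    print(f"NER precision interval: {low:.2%} to {high:.2%}")
```

Because F1 is the harmonic mean of precision and recall, it can also be computed directly as 2 × correct / (predicted + reference), which gives the same values.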