kegg_pull: a software package for the RESTful access and pulling from the Kyoto Encyclopedia of Gene and Genomes.

Affiliations

Markey Cancer Center, University of Kentucky, Lexington, KY, 40536, USA.

Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, KY, 40536, USA.

Publication information

BMC Bioinformatics. 2023 Mar 4;24(1):78. doi: 10.1186/s12859-023-05208-0.

DOI:10.1186/s12859-023-05208-0
PMID:36870946
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9985241/
Abstract

BACKGROUND

The Kyoto Encyclopedia of Genes and Genomes (KEGG) provides organized genomic, biomolecular, and metabolic information and knowledge that is reasonably current and highly useful for a wide range of analyses and modeling. KEGG follows the principles of data stewardship to be findable, accessible, interoperable, and reusable (FAIR) by providing RESTful access to their database entries via their web-accessible KEGG API. However, the overall FAIRness of KEGG is often limited by the library and software package support available in a given programming language. While R library support for KEGG is fairly strong, Python library support has been lacking. Moreover, there is no software that provides extensive command line level support for KEGG access and utilization.
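As a rough illustration of what RESTful access to KEGG entries looks like, the sketch below builds request URLs for the public KEGG REST service at `https://rest.kegg.jp`. The helper names `kegg_list_url` and `kegg_get_url` are hypothetical and are not part of kegg_pull or any other library; only the `list` and `get` operations and the `+`-joined entry syntax come from the KEGG API itself.

```python
from urllib.parse import quote

KEGG_BASE_URL = "https://rest.kegg.jp"  # public KEGG REST endpoint


def kegg_list_url(database):
    """URL that lists the entry identifiers of a KEGG database (hypothetical helper)."""
    return f"{KEGG_BASE_URL}/list/{quote(database)}"


def kegg_get_url(entry_ids, entry_field=None):
    """URL that retrieves one or more entries; KEGG joins identifiers with '+'."""
    joined = "+".join(quote(entry_id, safe=":") for entry_id in entry_ids)
    url = f"{KEGG_BASE_URL}/get/{joined}"
    # An optional trailing field such as "mol" or "aaseq" narrows what is returned.
    return f"{url}/{entry_field}" if entry_field else url
```

For example, `kegg_get_url(["cpd:C00001", "cpd:C00002"])` yields a single URL that requests both compound entries in one call.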

RESULTS

We present kegg_pull, a package implemented in the Python programming language that provides better KEGG access and utilization functionality than previous libraries and software packages. Not only does kegg_pull include an application programming interface (API) for Python programming, it also provides a command line interface (CLI) that enables utilization of KEGG for a wide range of shell scripting and data analysis pipeline use-cases. As kegg_pull's name implies, both the API and CLI provide versatile options for pulling (downloading and saving) an arbitrary (user defined) number of database entries from the KEGG API. Moreover, this functionality is implemented to efficiently utilize multiple central processing unit cores as demonstrated in several performance tests. Many options are provided to optimize fault-tolerant performance across a single or multiple processes, with recommendations provided based on extensive testing and practical network considerations.
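The idea of pulling an arbitrary number of entries in a fault-tolerant, parallel way can be sketched as follows. This is not kegg_pull's implementation or API: every function name and parameter here (`pull_batch`, `pull_entries`, `n_tries`, `sleep_seconds`, `n_workers`) is an assumption made for illustration, and a thread pool stands in for the multi-process approach the abstract describes. The only KEGG-specific facts used are the `get` operation and its limit of 10 entries per request.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from urllib.error import URLError
from urllib.request import urlopen

MAX_ENTRIES_PER_GET = 10  # KEGG accepts at most 10 entries in one "get" request


def fetch_text(url, timeout=30):
    """Download a URL and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def batches(items, size=MAX_ENTRIES_PER_GET):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def pull_batch(entry_ids, fetch=fetch_text, n_tries=3, sleep_seconds=1.0):
    """Request one batch, retrying on network errors; text is None if all tries fail."""
    url = "https://rest.kegg.jp/get/" + "+".join(entry_ids)
    for attempt in range(1, n_tries + 1):
        try:
            return entry_ids, fetch(url)
        except (URLError, TimeoutError):
            if attempt < n_tries:
                time.sleep(sleep_seconds)  # brief pause before the next try
    return entry_ids, None


def pull_entries(entry_ids, fetch=fetch_text, n_workers=4, **retry_options):
    """Pull all entries in batches; return (texts keyed by batch, failed entry IDs)."""
    pulled, failed = {}, []
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        jobs = [pool.submit(pull_batch, batch, fetch, **retry_options)
                for batch in batches(list(entry_ids))]
        for job in jobs:
            ids, text = job.result()
            if text is None:
                failed.extend(ids)
            else:
                pulled["+".join(ids)] = text
    return pulled, failed
```

Passing the downloader in as `fetch` keeps the retry and batching logic testable without network access, and returning the failed identifiers lets a caller retry them later rather than losing them.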

CONCLUSIONS

The new kegg_pull package enables new flexible KEGG retrieval use cases not available in previous software packages. The most notable new feature that kegg_pull provides is its ability to robustly pull an arbitrary number of KEGG entries with a single API method or CLI command, including pulling an entire KEGG database. We provide recommendations to users for the most effective use of kegg_pull according to their network and computational circumstances.
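Pulling an entire database starts from knowing which entries it contains. A minimal sketch of that first step, assuming the tab-separated output of the KEGG REST `list` operation (identifier, tab, description on each line), is shown below. The function names are hypothetical and do not reflect how kegg_pull itself is organized.

```python
from urllib.request import urlopen


def fetch_text(url, timeout=30):
    """Download a URL and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_entry_ids(list_response):
    """Extract the identifier (text before the first tab) from each non-empty line."""
    entry_ids = []
    for line in list_response.splitlines():
        if line.strip():
            entry_ids.append(line.split("\t", 1)[0].strip())
    return entry_ids


def database_entry_ids(database, fetch=fetch_text):
    """Return every entry identifier that KEGG lists for `database`."""
    return parse_entry_ids(fetch(f"https://rest.kegg.jp/list/{database}"))
```

The resulting identifiers can then be requested in batches, which is how a single call can cover a whole database.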

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f2be/9985241/0788319884ed/12859_2023_5208_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f2be/9985241/57c58cfd7fab/12859_2023_5208_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f2be/9985241/f580071efbef/12859_2023_5208_Fig3_HTML.jpg
