HGNChelper: identification and correction of invalid gene symbols for human and mouse.

Authors

Oh Sehyun, Abdelnabi Jasmine, Al-Dulaimi Ragheed, Aggarwal Ayush, Ramos Marcel, Davis Sean, Riester Markus, Waldron Levi

Affiliations

Epidemiology and Biostatistics, Graduate School of Public Health and Health Policy, City University of New York, New York, 10027, USA.

Institute for Implementation Science and Population Health, New York, 10027, USA.

Publication information

F1000Res. 2020 Dec 21;9:1493. doi: 10.12688/f1000research.28033.2. eCollection 2020.

DOI: 10.12688/f1000research.28033.2
PMID: 33564398
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7856679/
Abstract

Gene symbols are recognizable identifiers for gene names but are unstable and error-prone due to aliasing, manual entry, and unintentional conversion by spreadsheets to date format. Official gene symbol resources such as HUGO Gene Nomenclature Committee (HGNC) for human genes and the Mouse Genome Informatics project (MGI) for mouse genes provide authoritative sources of valid, aliased, and outdated symbols, but lack a programmatic interface and correction of symbols converted by spreadsheets. We present HGNChelper, an R package that identifies known aliases and outdated gene symbols based on the HGNC human and MGI mouse gene symbol databases, in addition to common mislabeling introduced by spreadsheets, and provides corrections where possible. HGNChelper identified invalid gene symbols in the most recent Molecular Signatures Database (MSigDB 7.0) and in platform annotation files of the Gene Expression Omnibus, with prevalence ranging from ~3% in recent platforms to 30-40% in the earliest platforms from 2002-03. HGNChelper is installable from CRAN.
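
The kind of check the abstract describes can be illustrated with a minimal sketch. HGNChelper itself is an R package; the Python below is not its implementation, only a toy illustration of the idea. The tables `APPROVED`, `ALIASES`, and `MONTH_PREFIX`, and the functions `undo_date_conversion` and `check_symbol`, are hypothetical names introduced here, and the tables are tiny hand-written samples rather than the HGNC or MGI databases.

```python
import re
from typing import Optional, Tuple

# Toy, hand-written tables for illustration only (assumption); a real tool
# would load the current HGNC or MGI symbol tables instead.
APPROVED = {"TP53", "SEPTIN2", "MARCHF1"}
ALIASES = {"P53": "TP53", "SEPT2": "SEPTIN2", "MARCH1": "MARCHF1"}

# Spreadsheets can turn symbols such as SEPT2 into date strings such as "2-Sep".
MONTH_PREFIX = {"Sep": "SEPT", "Mar": "MARCH", "Dec": "DEC"}
DATE_RE = re.compile(r"^(\d{1,2})-([A-Za-z]{3})$")


def undo_date_conversion(text: str) -> str:
    """Map a date-like string ("2-Sep") back to a symbol ("SEPT2") if possible."""
    match = DATE_RE.match(text)
    if not match:
        return text
    day, month = match.group(1), match.group(2).capitalize()
    prefix = MONTH_PREFIX.get(month)
    return f"{prefix}{int(day)}" if prefix else text


def check_symbol(symbol: str) -> Tuple[bool, Optional[str]]:
    """Return (is_approved, suggested_symbol); the suggestion is None if unknown."""
    cleaned = symbol.strip()
    if cleaned in APPROVED:
        return True, cleaned
    # Upper-casing is a simplification; some real symbols are mixed case.
    candidate = undo_date_conversion(cleaned).upper()
    if candidate in APPROVED:
        return False, candidate
    return False, ALIASES.get(candidate)


if __name__ == "__main__":
    for raw in ["TP53", "p53", "2-Sep", "1-Mar", "NOTAGENE"]:
        print(raw, check_symbol(raw))
```

The sketch separates the two failure modes named in the abstract: date-like strings are first mapped back to a symbol, and the result is then looked up as an approved symbol or a known alias. Symbols that match neither table are reported without a suggestion rather than guessed.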

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8598/9184926/a4ef3c6feefb/f1000research-9-133588-g0000.jpg

Similar articles

1
HGNChelper: identification and correction of invalid gene symbols for human and mouse.
F1000Res. 2020 Dec 21;9:1493. doi: 10.12688/f1000research.28033.2. eCollection 2020.
2
RGST - Rat Gene Symbol Tracker, a database for defining official rat gene symbols.
BMC Genomics. 2008 Jan 23;9:29. doi: 10.1186/1471-2164-9-29.
3
Genenames.org: the HGNC and VGNC resources in 2017.
Nucleic Acids Res. 2017 Jan 4;45(D1):D619-D625. doi: 10.1093/nar/gkw1033. Epub 2016 Oct 30.
4
Genenames.org: the HGNC and VGNC resources in 2021.
Nucleic Acids Res. 2021 Jan 8;49(D1):D939-D946. doi: 10.1093/nar/gkaa980.
5
Genenames.org: the HGNC resources in 2023.
Nucleic Acids Res. 2023 Jan 6;51(D1):D1003-D1009. doi: 10.1093/nar/gkac888.
6
Genew: the human gene nomenclature database.
Nucleic Acids Res. 2002 Jan 1;30(1):169-71. doi: 10.1093/nar/30.1.169.
7
A standardised nomenclature for long non-coding RNAs.
IUBMB Life. 2023 May;75(5):380-389. doi: 10.1002/iub.2663. Epub 2022 Jul 26.
8
Murine allele and transgene symbols: ensuring unique, concise, and informative nomenclature.
Mamm Genome. 2022 Mar;33(1):108-119. doi: 10.1007/s00335-021-09902-3. Epub 2021 Aug 14.
9
genenames.org: the HGNC resources in 2011.
Nucleic Acids Res. 2011 Jan;39(Database issue):D514-9. doi: 10.1093/nar/gkq892. Epub 2010 Oct 6.
10
Genenames.org: the HGNC resources in 2015.
Nucleic Acids Res. 2015 Jan;43(Database issue):D1079-85. doi: 10.1093/nar/gku1071. Epub 2014 Oct 31.

Cited by

1
Considerations and Software for Successful Immune Cell Deconvolution Using Proteomics Data.
J Proteome Res. 2025 Aug 1;24(8):3751-3761. doi: 10.1021/acs.jproteome.4c00868. Epub 2025 Jul 14.
2
GDF15 reprograms the microenvironment to drive the development of uveal melanoma liver metastases.
bioRxiv. 2025 May 10:2025.05.07.652654. doi: 10.1101/2025.05.07.652654.
3
Epigenetic Changes Regulating Epithelial-Mesenchymal Plasticity in Human Trophoblast Differentiation.
Cells. 2025 Jun 24;14(13):970. doi: 10.3390/cells14130970.
4
Explainable Machine Learning Identifies Factors for Dosage Compensation in Aneuploid Human Cancer Cells.
bioRxiv. 2025 May 13:2025.05.12.653427. doi: 10.1101/2025.05.12.653427.
5
Analysis of the cross-study replicability of tuberculosis gene signatures using 49 curated human transcriptomic datasets.
Tuberculosis (Edinb). 2025 Jul;153:102649. doi: 10.1016/j.tube.2025.102649. Epub 2025 May 8.
6
Evaluating transcriptional alterations associated with ageing and developing age prediction models based on the human blood transcriptome.
Biogerontology. 2025 Apr 4;26(2):86. doi: 10.1007/s10522-025-10216-z.
7
DGCR8 haploinsufficiency leads to primate-specific RNA dysregulation and pluripotency defects.
Nucleic Acids Res. 2025 Mar 20;53(6). doi: 10.1093/nar/gkaf197.
8
Preclinical studies and transcriptome analysis in a model of Parkinson's disease with dopaminergic ZNF746 expression.
Mol Neurodegener. 2025 Feb 28;20(1):24. doi: 10.1186/s13024-025-00814-3.
9
Two subtle problems with overrepresentation analysis.
Bioinform Adv. 2024 Oct 21;4(1):vbae159. doi: 10.1093/bioadv/vbae159. eCollection 2024.
10
Epigenetic changes regulating the epithelial-mesenchymal transition in human trophoblast differentiation.
bioRxiv. 2024 Jul 4:2024.07.02.601748. doi: 10.1101/2024.07.02.601748.