GeneBase 1.1: a tool to summarize data from NCBI gene datasets and its application to an update of human gene statistics.

Author information

Piovesan Allison, Caracausi Maria, Antonaros Francesca, Pelleri Maria Chiara, Vitale Lorenza

Affiliations

Department of Experimental, Diagnostic and Specialty Medicine (DIMES), Unit of Histology, Embryology and Applied Biology, University of Bologna, Via Belmeloro 8, 40126 Bologna, Italy.

Publication information

Database (Oxford). 2016 Dec 26;2016. doi: 10.1093/database/baw153. Print 2016.

Abstract

We release GeneBase 1.1, a local tool with a graphical interface useful for parsing, structuring and indexing data from the National Center for Biotechnology Information (NCBI) Gene data bank. Compared to its predecessor GeneBase (1.0), GeneBase 1.1 now allows dynamic calculation and summarization in terms of median, mean, standard deviation and total for many quantitative parameters associated with genes, gene transcripts and gene features (exons, introns, coding sequences, untranslated regions). GeneBase 1.1 thus offers the opportunity to perform analyses of the main gene structure parameters also following the search for any set of genes with the desired characteristics, allowing unique functionalities not provided by the NCBI Gene itself. In order to show the potential of our tool for local parsing, structuring and dynamic summarizing of publicly available databases for data retrieval, analysis and testing of biological hypotheses, we provide as a sample application a revised set of statistics for human nuclear genes, gene transcripts and gene features. In contrast with previous estimations strongly underestimating the length of human genes, a 'mean' human protein-coding gene is 67 kbp long, has eleven 309 bp long exons and ten 6355 bp long introns. Median, mean and extreme values are provided for many other features offering an updated reference source for human genome studies, data useful to set parameters for bioinformatic tools and interesting clues to the biomedical meaning of the gene features themselves.

Database URL: http://apollo11.isto.unibo.it/software/.
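The four summary measures named in the abstract (median, mean, standard deviation, total) and the 'mean' gene figures can be illustrated with a short Python sketch. This is not GeneBase code: the `summarize` helper and the constant names are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not say which estimator the tool applies. The arithmetic check only shows that the quoted exon and intron means are roughly consistent with the quoted 67 kbp gene length; combining per-feature means this way is an approximation, not how the authors derived the figure.

```python
from statistics import mean, median, stdev


def summarize(lengths):
    """Return median, mean, standard deviation and total for lengths in bp.

    Assumption: sample standard deviation (n - 1 denominator); the
    abstract does not specify which estimator GeneBase uses.
    """
    return {
        "median": median(lengths),
        "mean": mean(lengths),
        "sd": stdev(lengths) if len(lengths) > 1 else 0.0,
        "total": sum(lengths),
    }


# Figures quoted in the abstract for a 'mean' human protein-coding gene.
EXON_COUNT, MEAN_EXON_BP = 11, 309
INTRON_COUNT, MEAN_INTRON_BP = 10, 6355

# Rough consistency check: exons plus introns should approximate gene length.
approx_gene_bp = EXON_COUNT * MEAN_EXON_BP + INTRON_COUNT * MEAN_INTRON_BP

if __name__ == "__main__":
    print(approx_gene_bp)                # 66949 bp, i.e. about 67 kbp
    print(summarize([100, 200, 300]))    # toy lengths, not real gene data
```

Eleven exons bracket exactly ten introns, so the exon and intron counts in the abstract are also mutually consistent.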

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/2385/5199132/dc43786d83f9/baw153f1p.jpg
