CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences.

Authors

Chen Xianfeng, Laudeman Thomas W, Rushton Paul J, Spraggins Thomas A, Timko Michael P

Affiliation

Department of Microbiology, University of Virginia Health System, Charlottesville, VA 22908, USA.

Publication

BMC Bioinformatics. 2007 Apr 19;8:129. doi: 10.1186/1471-2105-8-129.

DOI: 10.1186/1471-2105-8-129
PMID: 17445272
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1868039/

Abstract

BACKGROUND

Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important food and forage legumes in the semi-arid tropics because of its ability to tolerate drought and grow on poor soils. It is cultivated mostly by poor farmers in developing countries, with 80% of production taking place in the dry savannah of tropical West and Central Africa. Cowpea is largely an underexploited crop with relatively little genomic information available for use in applied plant breeding. The goal of the Cowpea Genomics Initiative (CGI), funded by the Kirkhouse Trust, a UK-based charitable organization, is to leverage modern molecular genetic tools for gene discovery and cowpea improvement. One aspect of the initiative is sequencing the gene-rich region of the cowpea genome (termed the genespace), recovered using methylation filtration technology, and annotating and analyzing the resulting sequence data.

DESCRIPTION

CGKB, the Cowpea Genespace/Genomics Knowledge Base, is an annotation knowledge base developed under the CGI. The database is built on information derived from 298,848 cowpea genespace sequences (GSS) isolated by methylation filtering of genomic DNA. CGKB consists of three knowledge bases: a GSS annotation and comparative genomics knowledge base, a GSS enzyme and metabolic pathway knowledge base, and a GSS simple sequence repeat (SSR) knowledge base for molecular marker discovery. The GSS were annotated with a homology-based approach, mainly using BLASTX against four public FASTA-formatted protein databases (NCBI GenBank Proteins, UniProtKB-Swiss-Prot, UniProtKB-PIR (Protein Information Resource), and UniProtKB-TrEMBL). Comparative genome analysis was done by BLASTX searches of the cowpea GSS against four plant proteomes: Arabidopsis thaliana, Oryza sativa, Medicago truncatula, and Populus trichocarpa. Possible exons and introns on each cowpea GSS were predicted using the HMM-based Genscan gene prediction program, and potential domains on annotated GSS were analyzed using the HMMER package against the Pfam database. The annotated GSS were also assigned Gene Ontology (GO) annotation terms and integrated with 228 curated plant metabolic pathways from the Arabidopsis Information Resource (TAIR) knowledge base. The UniProtKB-Swiss-Prot ENZYME database was used to assign a putative enzymatic function to each GSS. Each GSS was also analyzed with the Tandem Repeats Finder (TRF) program to identify potential SSRs for molecular marker discovery.

The raw sequence data, processed annotations, and SSR results were stored in relational tables designed in key-value pair fashion in a PostgreSQL relational database management system. The biological knowledge derived from the sequence data and processed results is represented as views or materialized views in the database, and all materialized views are indexed for quick data access and retrieval. Data processing and analysis pipelines were implemented in the Perl programming language. The web interface was implemented in JavaScript and Perl CGI running on an Apache web server. The CPU-intensive data processing and analysis pipelines were run on a computer cluster of more than 30 dual-processor Apple XServes. A job management system called Vela was created as a robust way to submit large numbers of jobs to the Portable Batch System (PBS).
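
The storage design described above (key-value relational tables, with knowledge exposed through views and indexed materialized views) can be illustrated with a minimal sketch. This is not the CGKB schema: the table, column, and view names and the sample rows are hypothetical, and SQLite from the Python standard library stands in for PostgreSQL so that the example runs without a database server. SQLite has no materialized views, so a snapshot table emulates one here.

```python
import sqlite3

# SQLite stands in for PostgreSQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Key-value style table: one row per annotation fact about a sequence.
# All names and rows below are hypothetical, for illustration only.
cur.execute("""
    CREATE TABLE gss_annotation (
        gss_id TEXT NOT NULL,   -- made-up sequence identifier
        attr   TEXT NOT NULL,   -- annotation type, e.g. 'best_hit'
        value  TEXT NOT NULL
    )
""")
rows = [
    ("GSS0001", "best_hit", "hypothetical kinase"),
    ("GSS0001", "go_term", "GO:0004672"),
    ("GSS0001", "ec_number", "2.7.11.1"),
    ("GSS0002", "best_hit", "hypothetical transporter"),
    ("GSS0002", "go_term", "GO:0005215"),
]
cur.executemany("INSERT INTO gss_annotation VALUES (?, ?, ?)", rows)

# A plain view pivots the key-value rows into one row per sequence.
cur.execute("""
    CREATE VIEW gss_summary AS
    SELECT gss_id,
           MAX(CASE WHEN attr = 'best_hit'  THEN value END) AS best_hit,
           MAX(CASE WHEN attr = 'go_term'   THEN value END) AS go_term,
           MAX(CASE WHEN attr = 'ec_number' THEN value END) AS ec_number
    FROM gss_annotation
    GROUP BY gss_id
""")

# Emulated materialized view: a snapshot table built from the view,
# then indexed on the column used for lookups.
cur.execute("CREATE TABLE gss_summary_mat AS SELECT * FROM gss_summary")
cur.execute("CREATE INDEX idx_summary_ec ON gss_summary_mat (ec_number)")

# Query the snapshot for sequences that carry a putative enzyme function.
enzymes = cur.execute(
    "SELECT gss_id, best_hit FROM gss_summary_mat "
    "WHERE ec_number IS NOT NULL ORDER BY gss_id"
).fetchall()
print(enzymes)
```

The key-value layout lets new annotation types be added as rows rather than schema changes, while the pivoted snapshot keeps read queries simple. Current PostgreSQL releases (9.3 and later) provide `CREATE MATERIALIZED VIEW` directly; whether CGKB used tables, triggers, or another mechanism for its materialized views is not stated in the abstract.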

CONCLUSION

CGKB is an integrated and annotated resource for cowpea GSS with features of homology-based and HMM-based annotations, enzyme and pathway annotations, GO term annotation, toolkits, and a large number of other facilities to perform complex queries. The cowpea GSS, chloroplast sequences, mitochondrial sequences, retroelements, and SSR sequences are available as FASTA formatted files and downloadable at CGKB. This database and web interface are publicly accessible at http://cowpeagenomics.med.virginia.edu/CGKB/.
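
As a rough illustration of working with FASTA-formatted downloads such as these, the sketch below reads FASTA records and scans each sequence for perfect SSRs. It is a simplified stand-in, not the TRF program used by CGKB: TRF also detects approximate repeats, whereas this regular expression only finds exact tandem copies. The function names, the repeat threshold, and the two demo sequences are assumptions made up for the example.

```python
import re
from io import StringIO

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)

def find_ssrs(seq, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect repeats of 2-6 bp motifs.

    Start positions are 0-based. Homopolymer runs are skipped.
    """
    # Lazy motif capture tries the shortest motif first; \1{n,} requires
    # at least n further exact copies directly after the first one.
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # e.g. 'AA' repeated is a homopolymer, not an SSR here
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits

# Made-up demo records; real input would be a downloaded FASTA file.
demo = StringIO(">demo1\nGGCATATATATATCCG\n>demo2\nTTGAAGAAGAAGAAGAAGCC\n")
results = {name: find_ssrs(seq) for name, seq in read_fasta(demo)}
print(results)
```

For real marker discovery, thresholds are usually set per motif length, and flanking sequence is retained for primer design; both are omitted here to keep the sketch short.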
