Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs.

Author affiliation

BIO5 Institute, University of Arizona, Tucson, Arizona, USA.

Publication information

PLoS Genet. 2009 Nov;5(11):e1000740. doi: 10.1371/journal.pgen.1000740. Epub 2009 Nov 20.

Abstract

Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5' and 3' UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).
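The percentages in the abstract can be turned into approximate sequence counts with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, the counts are back-calculated from rounded percentages rather than taken from the paper, and the coding-length estimate assumes the reported 1.44 kb average is simply the sum of the 5' UTR, the coding region, and the 3' UTR.

```python
# Figures quoted in the abstract (names here are illustrative, not from the paper).
TOTAL_FLCDNAS = 27_455          # high-quality FLcDNAs sequenced
NON_REDUNDANT = 24_467          # sequences after the strict assembly
MEAN_TRANSCRIPT_BASES = 1_440   # 1.44 kb, a rounded average
MEAN_5P_UTR_BASES = 218
MEAN_3P_UTR_BASES = 321


def approx_count(percent, total=TOTAL_FLCDNAS):
    """Convert a rounded percentage into an approximate number of sequences."""
    return round(total * percent / 100)


# Approximate counts implied by the reported percentages.
approx_counts = {
    "predicted protein < 100 aa (8.6%)": approx_count(8.6),
    "stringently mapped to genome (94%)": approx_count(94),
    "TE sequence in coding/UTR (5.6%)": approx_count(5.6),
    "putative transcription factors (7.2%)": approx_count(7.2),
    "non-redundant with non-maize match (88%)": approx_count(88, NON_REDUNDANT),
}

# Assumption: average coding length = average transcript minus both average UTRs.
mean_cds_bases = MEAN_TRANSCRIPT_BASES - MEAN_5P_UTR_BASES - MEAN_3P_UTR_BASES
mean_codons = mean_cds_bases // 3

# Share of the sequenced FLcDNAs that remain after the strict assembly.
non_redundant_fraction = NON_REDUNDANT / TOTAL_FLCDNAS

if __name__ == "__main__":
    for label, count in approx_counts.items():
        print(f"{label}: ~{count:,}")
    print(f"mean coding length: ~{mean_cds_bases} bases (~{mean_codons} codons)")
    print(f"non-redundant fraction: {non_redundant_fraction:.1%}")
```

Because every input is a rounded summary statistic, the outputs are order-of-magnitude checks on the abstract's figures, not values reported by the authors.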


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/e76b/2774520/a4250df86813/pgen.1000740.g001.jpg
