Interdisciplinary Centre for Plant Genomics and Department of Plant Molecular Biology, University of Delhi, South Campus, New Delhi, India.
Proteomics Laboratory, Division of Plant Biotechnology, Sher-e-Kashmir University of Agricultural Sciences & Technology of Kashmir, Shalimar, Srinagar, Jammu & Kashmir, India.
Sci Rep. 2020 Sep 15;10(1):15116. doi: 10.1038/s41598-020-70713-8.
Nuclear proteins are primarily regulatory factors governing gene expression. Multiple factors determine the localization of a protein in the nucleus, and accurate identification of nuclear proteins remains difficult. We combined information from subcellular prediction tools, experimental evidence, and nuclear proteome data to identify a reliable list of seed-expressed nuclear proteins in rice. Depending on the number of prediction tools calling a protein nuclear, we sorted 19,441 seed-expressed proteins into five categories. Of these, half were called nuclear by at least one of the four prediction tools. Further, gene ontology (GO) enrichment and transcription factor composition analysis showed that 6116 seed-expressed proteins could be called nuclear with greater confidence. Localization evidence from experimental data was available for 1360 proteins. Analysis of these proteins showed that a nuclear call is 92.04% accurate for proteins predicted nuclear by at least three tools. The distribution of nuclear localization signals and nuclear export signals showed that the majority of category four members were nuclear resident proteins, whereas the other categories had a low fraction of nuclear resident proteins and a significantly higher proportion of shuttling proteins. We compiled all the above information for the seed-expressed genes in a searchable database named Rice Seed Nuclear Protein DataBase (RSNP-DB), available at https://pmb.du.ac.in/rsnpdb. This information will be useful for understanding the role of the seed nuclear proteome in rice.
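The consensus categorization and accuracy check described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline. It assumes that the five categories correspond to 0 through 4 tools calling a protein nuclear, and that accuracy is the fraction of experimentally characterized proteins in the high-consensus set that are experimentally nuclear. The function names, protein IDs, and toy data are hypothetical.

```python
from collections import Counter


def categorize(tool_calls):
    """Map each protein ID to the number of tools (0-4) that call it nuclear.

    Assumption: category k means exactly k of the four tools predict 'nuclear'.
    """
    return {pid: sum(bool(call) for call in calls)
            for pid, calls in tool_calls.items()}


def nuclear_call_accuracy(categories, experimental, min_tools=3):
    """Fraction of proteins with experimental evidence that are experimentally
    nuclear, among those predicted nuclear by at least `min_tools` tools.

    Returns None if no such protein has experimental evidence.
    """
    checked = [pid for pid, n in categories.items()
               if n >= min_tools and pid in experimental]
    if not checked:
        return None
    return sum(experimental[pid] for pid in checked) / len(checked)


# Toy, made-up data: one boolean per (hypothetical) prediction tool.
tool_calls = {
    "P1": (True, True, True, True),
    "P2": (True, True, True, False),
    "P3": (True, False, False, False),
    "P4": (False, False, False, False),
    "P5": (True, True, True, True),
}
# Hypothetical experimental localization: True means observed in the nucleus.
experimental = {"P1": True, "P2": False, "P3": False, "P5": True}

categories = categorize(tool_calls)
category_sizes = Counter(categories.values())
accuracy = nuclear_call_accuracy(categories, experimental)
```

With real data, `tool_calls` would hold the outputs of the four prediction tools, and `experimental` would cover only the subset of proteins with localization evidence (1360 in the study).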