Mishra Shital Kumar, Wang Han
Center for Circadian Clocks, Soochow University, Suzhou 215123, China.
School of Biology & Basic Medical Sciences, Medical College, Soochow University, Suzhou 215123, China.
Biology (Basel). 2021 Apr 26;10(5):371. doi: 10.3390/biology10050371.
Recent studies have demonstrated that numerous long noncoding RNAs (lncRNAs; noncoding RNAs longer than 200 nucleotides) actually encode functional micropeptides, which likely represent the next frontier in regulatory biology. Identifying coding lncRNAs in ever-growing lncRNA databases is therefore a bioinformatic challenge. Here we employed the Coding-Potential Assessment Tool (CPAT), Coding Potential Calculator 2 (CPC2), the LGC web server, the Coding-Non-Coding Identifying Tool (CNIT), RNAsamba, and the MicroPeptide identification tool (MiPepid) to analyze approximately 21,000 zebrafish lncRNAs. Computationally, we identified 2730 to 6676 zebrafish lncRNAs with high coding potential, including 313 coding lncRNAs predicted by all six bioinformatic tools. We also compared the sensitivity and specificity of these six tools for identifying lncRNAs with coding potential and summarized their strengths and weaknesses. The predicted zebrafish coding lncRNAs set the stage for further experimental studies.
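The two computations the abstract describes, taking the lncRNAs called coding by every tool and comparing tools by sensitivity and specificity, can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the transcript IDs, the per-tool prediction sets, and the labeled positive and negative sets are hypothetical toy data, and the tool names appear only as dictionary keys (no tool is actually invoked). The score thresholds and benchmark sets used in the study are not stated in the abstract, so none are assumed here.

```python
from typing import Dict, Set, Tuple


def consensus(predictions: Dict[str, Set[str]]) -> Set[str]:
    """Return the transcript IDs that every tool predicts as coding."""
    sets = list(predictions.values())
    if not sets:
        return set()
    return set.intersection(*sets)


def sensitivity_specificity(
    predicted: Set[str], positives: Set[str], negatives: Set[str]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) of one tool on a labeled set.

    positives: transcripts known to be coding (hypothetical benchmark).
    negatives: transcripts known to be noncoding (hypothetical benchmark).
    """
    tp = len(predicted & positives)   # coding, called coding
    fn = len(positives - predicted)   # coding, missed
    fp = len(predicted & negatives)   # noncoding, called coding
    tn = len(negatives - predicted)   # noncoding, correctly rejected
    sens = tp / (tp + fn) if (tp + fn) else float("nan")
    spec = tn / (tn + fp) if (tn + fp) else float("nan")
    return sens, spec


# Hypothetical toy predictions; IDs are placeholders, not real zebrafish lncRNAs.
toy_predictions = {
    "CPAT": {"lnc1", "lnc2", "lnc3"},
    "CPC2": {"lnc2", "lnc3", "lnc4"},
    "RNAsamba": {"lnc2", "lnc3", "lnc5"},
}
toy_consensus = consensus(toy_predictions)

# Hypothetical labeled benchmark for evaluating a single tool.
toy_positives = {"lnc2", "lnc3", "lnc6"}
toy_negatives = {"lnc1", "lnc7"}
toy_sens, toy_spec = sensitivity_specificity(
    toy_predictions["CPAT"], toy_positives, toy_negatives
)

print(sorted(toy_consensus))
print(round(toy_sens, 3), round(toy_spec, 3))
```

Taking the intersection is the strictest way to combine tools: it lowers the number of candidates (313 of roughly 21,000 in the study) in exchange for higher agreement, whereas a union or majority vote would keep more candidates at the cost of more disagreement between tools.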