Moreira Daniel A, Lamarca Alessandra P, Soares Rafael Ferreira, Coelho Ana M A, Furtado Carolina, Scherer Nicole M, Moreira Miguel A M, Seuánez Hector N, Boroni Mariana
Laboratory of Bioinformatics and Computational Biology, Division of Experimental and Translational Research, Brazilian National Cancer Institute (INCA), Rio de Janeiro, Brazil.
Laboratory of Bioinformatics and Molecular Evolution, Department of Genetics, Institute of Biology, Federal University of Rio de Janeiro (UFRJ), Rio de Janeiro, Brazil.
Front Genet. 2020 Jul 31;11:831. doi: 10.3389/fgene.2020.00831. eCollection 2020.
The southern muriqui (Brachyteles arachnoides) is the largest neotropical primate. This species is endemic to Brazil and is currently critically endangered due to habitat destruction. The genetic basis underlying adaptive traits of New World monkeys (NWM) has been a subject of interest to several investigators, with particular attention to genes related to the immune system. In the absence of a reference genome, RNA-seq and transcriptome assembly have proved to be valuable procedures for accessing gene sequences and testing evolutionary hypotheses. We present here the first report on the sequencing, assembly, annotation and adaptive selection analysis of thousands of transcripts of B. arachnoides from two different samples, corresponding to 13 different blood cells and fibroblasts. We assembled 284,283 transcripts with an N50 of 2,940 bp and a high rate of complete transcripts, with a median high-scoring pair coverage of 88.2%, including lowly expressed transcripts, accounting for 72.3% of complete BUSCOs. We predicted and extracted 81,400 coding sequences, of which 79.8% had a significant BLAST hit against the Euarchontoglires SwissProt dataset. Of these 64,929 sequences, 34,084 were considered homologous to Supraprimate proteins, and of the remaining sequences (30,845), 94% were associated with a protein domain or a KEGG Orthology group, indicating potentially novel or specific protein-coding genes of B. arachnoides. We used the predicted protein sequences to perform a comparative analysis with 10 other primates. This analysis revealed, for the first time in an Atelid species, an expansion of , extending this knowledge to all NWM families. Using a branch-site model, we searched for evidence of positive selection in 4,533 orthologous sets. This evolutionary analysis revealed 132 amino acid sites in 30 genes potentially evolving under positive selection, shedding light on primate genome evolution.
These genes belonged to a wide variety of categories, including genes encoding innate immune system proteins (, , and ), among others related to the immune response. This work generated a set of thousands of complete sequences that can be used in other studies on molecular evolution and may help to unveil the evolution of primate genes. Still, further functional studies are required to understand the evolutionary forces shaping the primate genome.
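The N50 of 2,940 bp reported for the assembly is a length-weighted summary statistic: it is the length of the shortest transcript in the smallest set of longest transcripts that together cover at least half of the total assembled bases. The sketch below illustrates the calculation only. The function name `n50` and the toy lengths are hypothetical, and it is not the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp)."""
    # Walk through the sequences from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first length at which the running sum reaches half of
        # the total assembled bases is the N50.
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


# Toy example with made-up transcript lengths (not data from the study).
toy_lengths = [100, 200, 300, 400, 500]
print(n50(toy_lengths))
```

In practice the lengths would be read from the assembled transcript FASTA file. Because the statistic is weighted by length, a few long transcripts can raise it considerably, which is why it is usually reported alongside completeness measures such as BUSCO.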