Cornman Robert S, Robertson Laura S, Galbraith Heather, Blakeslee Carrie
Leetown Science Center, United States Geological Survey, Kearneysville, West Virginia, United States of America.
Northern Appalachian Research Branch (Leetown Science Center), United States Geological Survey, Wellsboro, Pennsylvania, United States of America.
PLoS One. 2014 Nov 6;9(11):e112420. doi: 10.1371/journal.pone.0112420. eCollection 2014.
Mussels are useful indicator species of environmental stress and degradation, and the global decline in freshwater mussel diversity and abundance is of conservation concern. Elliptio complanata is a common freshwater mussel of eastern North America that can serve both as an indicator and as an experimental model for understanding mussel physiology and genetics. To support the genetic components of these research goals, we assembled transcriptome contigs from Illumina paired-end reads. Despite efforts to collapse similar contigs, the final assembly contained more than 136,000 contigs with an N50 of 982 bp. Even so, comparisons to the CEGMA database of conserved eukaryotic genes indicated that ~20% of genes remain unrepresented. However, numerous candidate stress-response genes were present, and we identified lineage-specific patterns of diversification among molluscs for cytochrome P450 detoxification genes and two saccharide-modifying enzymes: 1,3 beta-galactosyltransferase and fucosyltransferase. Less than a quarter of contigs had protein-level similarity based on modest BLAST and HMMER3 statistical thresholds. These results add comparative genomic resources for molluscs and suggest a wealth of novel proteins and noncoding transcripts.
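For readers unfamiliar with the N50 statistic quoted above, the following is a minimal sketch of how it is commonly computed from a list of contig lengths. It is not the authors' pipeline: the function name `n50` and the toy lengths are hypothetical illustrations, and the definition assumed here is the usual one (the length of the contig at which the cumulative sum of lengths, taken from longest to shortest, first reaches half of the total assembly length).

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp).

    Assumed definition: sort contigs from longest to shortest and
    accumulate their lengths; N50 is the length of the contig at which
    the running total first reaches at least half of the assembly size.
    """
    if not lengths:
        raise ValueError("at least one contig length is required")

    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2

    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not data from the study.
toy_lengths = [2, 3, 4, 5, 6]
print(n50(toy_lengths))
```

Under this definition, an N50 of 982 bp means that contigs of 982 bp or longer account for at least half of the total assembled sequence.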