A computational approach for identifying pseudogenes in the ENCODE regions.

Authors

Zheng Deyou, Gerstein Mark B

Affiliation

Department of Molecular Biophysics and Biochemistry, Yale University, Whitney Avenue, New Haven, CT 06520, USA.

Publication information

Genome Biol. 2006;7 Suppl 1(Suppl 1):S13.1-10. doi: 10.1186/gb-2006-7-s1-s13. Epub 2006 Aug 7.

Abstract

BACKGROUND

Pseudogenes are heritable genetic elements that show sequence similarity to functional genes but carry deleterious mutations. We describe a computational pipeline for identifying them which, in contrast to previous work, explicitly uses the intron-exon structure of parent genes to classify pseudogenes. We require alignments between duplicated pseudogenes and their parents to span intron-exon junctions; this requirement can be used to distinguish true duplicated pseudogenes from processed pseudogenes that contain insertions.
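
The classification idea described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the `Block` record, the `classify` function, and the `min_gap` and `tolerance` thresholds are all hypothetical names and values chosen for illustration. The sketch assumes a parent coding sequence has already been aligned to the genome as a list of ungapped blocks on the forward strand, and that the parent's exon junctions are known as positions in coding-sequence coordinates. A large genomic gap that coincides with a parent junction suggests a retained intron (duplicated pseudogene); a gap elsewhere suggests an insertion into a processed pseudogene.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Block:
    """One ungapped alignment block (hypothetical representation)."""
    query_start: int   # start in the parent coding sequence, 0-based
    query_end: int     # end in the parent coding sequence, exclusive
    target_start: int  # start on the genome, 0-based
    target_end: int    # end on the genome, exclusive


def classify(blocks: List[Block],
             junctions: Sequence[int],
             min_gap: int = 30,
             tolerance: int = 3) -> str:
    """Label an aligned locus by comparing genomic gaps with parent junctions.

    min_gap and tolerance are illustrative assumptions, not values
    taken from the paper.
    """
    ordered = sorted(blocks, key=lambda b: b.target_start)

    # Parent coordinates at which the genomic alignment is interrupted
    # by a gap of at least min_gap bases.
    gap_positions = [
        prev.query_end
        for prev, nxt in zip(ordered, ordered[1:])
        if nxt.target_start - prev.target_end >= min_gap
    ]

    if not gap_positions:
        # Alignment runs continuously across the parent sequence: no introns.
        return "processed"

    # A gap counts as intron-like only if it sits at a parent exon junction.
    at_junction = [
        g for g in gap_positions
        if any(abs(g - j) <= tolerance for j in junctions)
    ]
    if at_junction:
        return "duplicated"
    return "processed_with_insertion"
```

For example, with parent junctions at coding positions 100 and 250, a locus whose alignment breaks at position 100 with a 500-base genomic gap would be labelled as duplicated, whereas a break of similar size at position 180 would be labelled as a processed pseudogene with an insertion.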

RESULTS

Applying our approach to the ENCODE regions, we identify about 160 pseudogenes, 10% of which have clear 'intron-exon' structure and are thus likely generated from recent duplications.

CONCLUSION

Detailed examination of our results and comparison of our annotation with the GENCODE reference annotation demonstrate that our computational pipeline provides a good balance between identifying all pseudogenes and delineating the precise structure of duplicated genes.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/ee75/1810550/9a770574b838/gb-2006-7-s1-s13-1.jpg
