A computational approach for identifying pseudogenes in the ENCODE regions.

Authors

Zheng Deyou, Gerstein Mark B

Affiliation

Department of Molecular Biophysics and Biochemistry, Yale University, Whitney Avenue, New Haven, CT 06520, USA.

Publication

Genome Biol. 2006;7 Suppl 1(Suppl 1):S13.1-10. doi: 10.1186/gb-2006-7-s1-s13. Epub 2006 Aug 7.

DOI:10.1186/gb-2006-7-s1-s13

PMID:16925835

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1810550/

Abstract

BACKGROUND

Pseudogenes are inheritable genetic elements that show sequence similarity to functional genes but carry deleterious mutations. We describe a computational pipeline for identifying them which, in contrast to previous work, explicitly uses the intron-exon structure of parent genes to classify pseudogenes. We require alignments between duplicated pseudogenes and their parents to span intron-exon junctions, which distinguishes true duplicated pseudogenes from processed pseudogenes that carry insertions.

RESULTS

Applying our approach to the ENCODE regions, we identify about 160 pseudogenes, 10% of which have clear 'intron-exon' structure and are thus likely generated from recent duplications.

CONCLUSION

Detailed examination of our results and comparison of our annotation with the GENCODE reference annotation demonstrate that our computational pipeline provides a good balance between identifying all pseudogenes and delineating the precise structure of duplicated genes.
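
The classification idea stated in the Background (use the parent gene's intron-exon junctions to tell duplicated pseudogenes from processed ones that merely carry insertions) can be illustrated with a minimal sketch. This is not the authors' pipeline: the `Block` type, the function name `classify_pseudogene`, and the `min_gap` and `tolerance` thresholds are all hypothetical, and the sketch assumes plus-strand alignments with half-open coordinates.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Block:
    """One ungapped alignment block (hypothetical representation)."""
    parent_start: int  # start in parent coding-sequence coordinates
    parent_end: int    # end in parent coding-sequence coordinates
    genome_start: int  # start of the matching region in the genome
    genome_end: int    # end of the matching region in the genome


def classify_pseudogene(blocks: List[Block], junctions: List[int],
                        min_gap: int = 30, tolerance: int = 3) -> str:
    """Classify a pseudogene from its alignment to the parent gene.

    junctions: exon-exon junction positions in parent coding coordinates.
    min_gap, tolerance: illustrative thresholds, not values from the paper.
    """
    blocks = sorted(blocks, key=lambda b: b.genome_start)

    # Count genomic gaps whose position in the parent coincides with a junction.
    gaps_at_junction = 0
    for left, right in zip(blocks, blocks[1:]):
        gap = right.genome_start - left.genome_end
        if gap < min_gap:
            continue  # too short to be intron-like
        if any(abs(left.parent_end - j) <= tolerance for j in junctions):
            gaps_at_junction += 1

    if gaps_at_junction > 0:
        # Intron-like gaps sit at parent junctions: intron-exon structure kept.
        return "duplicated"

    # A block that runs straight through a junction means the intron is absent.
    crosses_junction = any(
        b.parent_start + tolerance < j < b.parent_end - tolerance
        for b in blocks for j in junctions
    )
    if crosses_junction:
        # Any gaps lie away from junctions, so they are treated as insertions.
        return "processed"

    return "ambiguous"  # no junction covered, e.g. a single-exon fragment


if __name__ == "__main__":
    junctions = [100, 250]
    duplicated = [Block(0, 100, 1000, 1100), Block(100, 250, 1600, 1750)]
    print(classify_pseudogene(duplicated, junctions))
```

In this sketch a gap only counts as intron-like when it lands on a parent junction, so a processed pseudogene interrupted by an insertion elsewhere is not mistaken for a duplicated one.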
