Maximum likelihood inference of imprinting and allele-specific expression from EST data.

Author information

Seoighe Cathal, Nembaware Victoria, Scheffler Konrad

Affiliation

Computational Biology Group, Institute of Infectious Disease and Molecular Medicine, University of Cape Town, South Africa.

Publication information

Bioinformatics. 2006 Dec 15;22(24):3032-9. doi: 10.1093/bioinformatics/btl521. Epub 2006 Oct 11.

Abstract

MOTIVATION

In a diploid organism, the proportion of transcripts produced from the two parental alleles can differ substantially, due, for example, to epigenetic modification that causes complete or partial silencing of one parental allele, or to cis-acting polymorphisms that affect transcriptional regulation. Counts of SNP alleles derived from EST sequences have been used to identify both novel candidates for genomic imprinting and examples of genes with allelic differences in expression.

RESULTS

We have developed a set of statistical models in a maximum likelihood framework that can make highly efficient use of public transcript data to identify genes with unequal representation of alternative alleles in cDNA libraries. We modelled both imprinting and allele-specific expression and applied the models to a large dataset of SNPs mapped to EST sequences. Using simulations, matched closely to real data, we demonstrate significantly improved performance over existing methods that have been applied to the same data. We further validated the power of this approach to detect imprinting using a set of known imprinted genes and inferred a set of candidate imprinted genes, several of which are in close proximity to known imprinted genes. We report evidence that there are undiscovered imprinted genes in known imprinted regions. Overall, more than half of the genes for which the most data are available show some evidence of allele-specific expression.
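
As a rough illustration of what a maximum likelihood test on allele counts can look like, the sketch below runs a likelihood ratio test for unequal allele representation at one SNP. It is not the authors' model: it assumes every library comes from a heterozygous donor, that EST reads are independent binomial draws with a shared allele fraction, and that there is no sequencing error. The function names `binom_loglik` and `allelic_imbalance_lrt` are hypothetical.

```python
import math


def binom_loglik(counts, p):
    """Binomial log-likelihood of allele counts, dropping the constant term.

    counts: list of (a, b) pairs, one per library, giving the number of ESTs
            carrying allele A and allele B (assumed heterozygous donors).
    p:      fraction of transcripts expected to carry allele A.
    """
    ll = 0.0
    for a, b in counts:
        # Skip zero counts so that p = 0 or p = 1 never reaches log(0).
        if a:
            ll += a * math.log(p)
        if b:
            ll += b * math.log(1.0 - p)
    return ll


def allelic_imbalance_lrt(counts):
    """Likelihood ratio test of p = 0.5 against a free allele fraction p.

    Returns (p_hat, statistic, p_value). The p-value uses the asymptotic
    chi-square distribution with one degree of freedom.
    """
    total_a = sum(a for a, _ in counts)
    total = sum(a + b for a, b in counts)
    if total == 0:
        raise ValueError("no allele counts supplied")

    # Maximum likelihood estimate of the allele-A fraction.
    p_hat = total_a / total

    ll_null = binom_loglik(counts, 0.5)
    ll_alt = binom_loglik(counts, p_hat)

    # Guard against tiny negative values caused by rounding.
    stat = max(2.0 * (ll_alt - ll_null), 0.0)

    # Chi-square (1 df) survival function: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return p_hat, stat, p_value


if __name__ == "__main__":
    # Illustrative counts only, not data from the paper.
    example = [(9, 2), (7, 1), (4, 3)]
    print(allelic_imbalance_lrt(example))
```

A realistic model would also have to handle donor genotypes that are unknown for most libraries, and the asymptotic chi-square p-value is only approximate when counts are small.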

AVAILABILITY

Software is available from the authors on request.
