
HapIso: An Accurate Method for the Haplotype-Specific Isoforms Reconstruction From Long Single-Molecule Reads.

Author information

Mangul Serghei, Yang Taegyun Harry, Hormozdiari Farhad, Dainis Alexandra Marie, Tseng Elizabeth, Ashley Euan A, Zelikovsky Alex, Eskin Eleazar

Publication information

IEEE Trans Nanobioscience. 2017 Mar;16(2):108-115. doi: 10.1109/TNB.2017.2675981. Epub 2017 Mar 17.

Abstract

RNA sequencing makes it possible to study an individual's transcriptome landscape and to determine allelic expression ratios. Single-molecule protocols generate multi-kilobase reads that are longer than most transcripts, so a single read can cover a complete haplotype isoform. This allows the reads to be partitioned into the two parental haplotypes. Although single-molecule reads are long, their relatively high error rate limits the ability to accurately detect genetic variants and assemble them into haplotype-specific isoforms. In this paper, we present Haplotype-specific Isoform reconstruction (HapIso), a method that tolerates the relatively high error rate of the single-molecule platform and partitions the isoform reads into the parental alleles. Phasing the reads according to their allele of origin allows our method to efficiently distinguish read errors from true biological mutations. HapIso uses a k-means clustering algorithm to group the reads into two meaningful clusters, maximizing the similarity of reads within a cluster and minimizing the similarity of reads from different clusters. Each cluster corresponds to a parental haplotype. We used family pedigree information to evaluate our approach. Experimental validation suggests that HapIso tolerates the relatively high error rate and accurately partitions the reads into the parental alleles of the isoform transcripts. We also applied HapIso to novel clinical single-molecule RNA-Seq data to estimate allele-specific expression of genes of interest. Our method was able to correct reads and identify a Glu1883Lys point mutation of clinical significance, validated by the GeneDx HCM panel. Furthermore, our method is the first able to reconstruct haplotype-specific isoforms from long single-molecule reads.
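The clustering step described in the abstract can be illustrated with a minimal sketch. This is not HapIso's implementation: it assumes each read has already been reduced to a 0/1 vector over candidate variant positions (0 = reference allele, 1 = alternate allele), it uses plain Lloyd-style k-means with k = 2 and a deterministic initialisation, and the names `two_means` and `consensus` are hypothetical. The per-cluster majority vote hints at how phasing could separate sporadic read errors from variants that are consistently supported within one haplotype.

```python
from typing import List, Tuple


def two_means(reads: List[List[int]], max_iter: int = 100) -> Tuple[List[int], List[List[float]]]:
    """Split 0/1 read vectors into two clusters with Lloyd-style k-means (k = 2).

    Illustrative only; the encoding and initialisation are assumptions.
    """
    if len(reads) < 2:
        raise ValueError("need at least two reads")

    def dist(read, centroid):
        # Squared Euclidean distance between a read and a centroid.
        return sum((a - c) ** 2 for a, c in zip(read, centroid))

    # Deterministic initialisation: the first read and the read farthest from it.
    first = [float(x) for x in reads[0]]
    farthest = max(reads, key=lambda r: dist(r, first))
    centroids = [first, [float(x) for x in farthest]]

    labels = [-1] * len(reads)
    for _ in range(max_iter):
        # Assignment step: each read goes to the nearer centroid.
        new_labels = [
            0 if dist(r, centroids[0]) <= dist(r, centroids[1]) else 1
            for r in reads
        ]
        if new_labels == labels:
            break  # converged
        labels = new_labels
        # Update step: each centroid becomes the per-position mean of its reads.
        for k in (0, 1):
            members = [r for r, lab in zip(reads, labels) if lab == k]
            if members:
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def consensus(centroid: List[float]) -> List[int]:
    """Majority vote per position: isolated read errors are outvoted."""
    return [1 if c > 0.5 else 0 for c in centroid]


# Toy data: three noisy reads from an all-0 haplotype, three from an all-1 haplotype.
reads = [
    [0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 1],
    [1, 1, 1, 1, 0],
]
labels, centroids = two_means(reads)
haplotypes = [consensus(c) for c in centroids]
```

On this toy input, each read carries at most one error, so the two clusters recover the two underlying haplotypes and the majority vote removes the single-read errors. Real single-molecule data would also need handling of uncovered positions and indels, which this sketch ignores.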

