Determining common insertion sites based on retroviral insertion distribution across tumors.

Author information

Chen Feng, Li Zhoufang, Chen Yi-Ping Phoebe

Affiliations

College of Information Science and Engineering, Henan University of Technology, Zhengzhou City, Henan Province 450001, China; Faculty of Science, Technology and Engineering, La Trobe University, Melbourne, Victoria 3086, Australia.

College of Information Science and Engineering, Henan University of Technology, Zhengzhou City, Henan Province 450001, China.

Publication information

Comput Biol Chem. 2014 Aug;51:83-92. doi: 10.1016/j.compbiolchem.2014.03.001. Epub 2014 Mar 12.

Abstract

A common insertion site (CIS) is a genomic region that is hit by retroviral insertions more frequently than expected by chance. Such regions are strongly associated with cancer gene loci, so detecting them helps identify cancer genes. An algorithm for detecting CISs should satisfy the following requirements: (1) it does not require any prior knowledge of the underlying insertion distribution; (2) it can resolve the insertion biases caused by hotspots; (3) it can detect CISs of any biological width; (4) it can identify noise resulting from statistical errors and non-CIS insertions; and (5) it can identify the widths of CISs as accurately as possible. We develop a method that addresses these requirements. We assess a region's significance from two perspectives: distribution width and distribution depth. The former indicates how many insertions fall in a region, while the latter evaluates how the insertions in a region are distributed across tumors. We compare our method with kernel density estimation and the sliding window approach on simulated data, showing that our method not only identifies cancer-related insertions effectively but also filters noise correctly. Experiments on real data show that taking the insertion distribution into account can highlight significant CISs. We detect 53 novel CISs, some of which are supported by the biological literature.
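As a rough illustration of the two perspectives named in the abstract, the sketch below summarizes one candidate region by (a) how many insertions fall in it and (b) how those insertions spread across tumors. It is not the authors' algorithm, which the abstract does not specify. The function names, the `(position, tumor_id)` input format, and the uniform Poisson background used for the p-value are all assumptions made for illustration; the Poisson test in particular is a simple stand-in, and unlike the method described in the abstract it does assume a background distribution.

```python
import math
from collections import Counter


def poisson_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam)."""
    if k <= 0:
        return 1.0
    cdf = sum(math.exp(-lam) * lam ** i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def region_summary(insertions, start, end, genome_length):
    """Summarize a candidate region [start, end).

    insertions: list of (position, tumor_id) pairs (hypothetical format).
    genome_length: total length used for the assumed uniform background.
    """
    hits = [(pos, tumor) for pos, tumor in insertions if start <= pos < end]
    per_tumor = Counter(tumor for _, tumor in hits)

    # Expected hits in the region if insertions were spread uniformly
    # over the genome (an assumption, not part of the published method).
    expected = len(insertions) * (end - start) / genome_length

    return {
        "insertions": len(hits),                  # how many insertions fall in the region
        "tumors": len(per_tumor),                 # how many distinct tumors contribute
        "max_per_tumor": max(per_tumor.values(), default=0),
        "p_value": poisson_tail(len(hits), expected),
    }


# Toy data: four insertions cluster near position 100, drawn from three tumors.
toy = [(100, "t1"), (120, "t2"), (130, "t3"), (140, "t1"), (5000, "t1")]
summary = region_summary(toy, start=0, end=200, genome_length=10_000)
```

A region whose insertions all come from a single tumor would show a high `max_per_tumor` relative to `tumors`, which is the kind of pattern a cross-tumor check is meant to flag as less convincing than the same count spread over many tumors.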
