
Accurately estimating the length distributions of genomic micro-satellites by tumor purity deconvolution.

Affiliations

School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710048, People's Republic of China.

Shaanxi Engineering Research Center of Medical and Health Big Data, School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, 710048, People's Republic of China.

Publication information

BMC Bioinformatics. 2020 Mar 11;21(Suppl 2):82. doi: 10.1186/s12859-020-3349-5.

DOI: 10.1186/s12859-020-3349-5
PMID: 32164528
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7069170/
Abstract

BACKGROUND

Genomic micro-satellites are genomic regions that consist of short, repetitive DNA motifs. Estimating the length distribution and state of a micro-satellite region is an important computational step in cancer sequencing data pipelines, and it is suggested to facilitate downstream analysis and clinical decision support. Although several state-of-the-art approaches have been proposed to identify micro-satellite instability (MSI) events, they are limited in dealing with regions longer than one read length. Moreover, to the best of our knowledge, all of these approaches implicitly assume that the tumor purity of the sequenced samples is sufficiently high. This assumption is inconsistent with reality; as a result, the data signal is diluted in the inferred length distribution and false positive errors are introduced.

RESULTS

In this article, we propose a computational approach, named ELMSI, which detects MSI events based on next-generation sequencing technology. ELMSI estimates the specific length distributions and states of micro-satellite regions from a mixed tumor sample paired with a control sample. It first estimates the purity of the tumor sample based on the read counts of the filtered SNV loci. Then, the algorithm identifies the length distributions and states of short micro-satellites by adding a Maximum Likelihood Estimation (MLE) step to the existing algorithm. After that, ELMSI infers the length distributions of long micro-satellites by combining a simplified Expectation Maximization (EM) algorithm with the central limit theorem, and then uses statistical tests to output the states of these micro-satellites. Based on our experimental results, ELMSI was able to handle micro-satellites with lengths ranging from shorter than one read length to 10 kbp.
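
The abstract does not give ELMSI's formulas, so the following Python sketch is only an illustration of the general idea of purity deconvolution, not the authors' implementation. It rests on two labeled assumptions: (1) the filtered SNVs are clonal, heterozygous, and copy-neutral, so their variant allele frequency is roughly half the purity; and (2) the length distribution observed in the mixed sample is a linear mixture, mixed = p * tumor + (1 - p) * normal. The function names `estimate_purity` and `deconvolve_lengths` are hypothetical.

```python
from statistics import median


def estimate_purity(alt_counts, total_counts):
    """Rough purity estimate from SNV read counts.

    Assumption (not from the paper): SNVs are clonal, heterozygous and
    copy-neutral, so expected VAF = purity / 2.
    """
    vafs = [a / t for a, t in zip(alt_counts, total_counts) if t > 0]
    if not vafs:
        raise ValueError("no SNV locus with non-zero coverage")
    # The median is used so that a few outlier loci do not dominate.
    return min(1.0, 2.0 * median(vafs))


def deconvolve_lengths(mixed, normal, purity):
    """Recover a tumor-only length distribution.

    Assumption (not from the paper): mixed = p * tumor + (1 - p) * normal,
    where each argument maps repeat length -> probability.
    """
    if not 0.0 < purity <= 1.0:
        raise ValueError("purity must be in (0, 1]")
    lengths = sorted(set(mixed) | set(normal))
    # Subtract the normal-cell contribution, then rescale by purity.
    raw = {
        n: (mixed.get(n, 0.0) - (1.0 - purity) * normal.get(n, 0.0)) / purity
        for n in lengths
    }
    # Sampling noise can push values below zero; clip and renormalize.
    clipped = {n: max(0.0, v) for n, v in raw.items()}
    total = sum(clipped.values())
    if total == 0.0:
        raise ValueError("no tumor signal left after deconvolution")
    return {n: v / total for n, v in clipped.items()}


if __name__ == "__main__":
    # Toy data: three SNV loci at 100x coverage with VAF near 0.30.
    purity = estimate_purity([30, 29, 31], [100, 100, 100])
    # The normal sample has a single allele of length 10; the mixed sample
    # shows 30% of reads supporting a contracted allele of length 8.
    tumor = deconvolve_lengths({8: 0.3, 10: 0.7}, {10: 1.0}, purity)
    print(purity, tumor)
```

In this toy case the contracted allele supports only 30% of reads in the mixed sample, yet after removing the normal-cell contribution it accounts for about half of the tumor-only distribution. This is the dilution effect that a high-purity assumption would miss.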

CONCLUSIONS

To verify the reliability of our algorithm, we first compared its ability to classify shorter micro-satellites from mixed samples with that of the existing algorithm MSIsensor. We also varied the number of micro-satellite regions, the read length, and the sequencing coverage to separately test the performance of ELMSI in estimating longer micro-satellites from mixed samples. ELMSI performed well on mixed samples, and is therefore of great value for improving the recognition of micro-satellite regions and for supporting clinical decision making. The source code has been uploaded and is maintained at https://github.com/YixuanWang1120/ELMSI for academic use only.


Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aec3/7069170/d661caa91006/12859_2020_3349_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aec3/7069170/998a8a2d24a3/12859_2020_3349_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aec3/7069170/d9e793661a7d/12859_2020_3349_Fig3_HTML.jpg


