Zhai Ting, Laverty Daniel J, Nagel Zachary D
Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, MA 02115.
bioRxiv. 2025 Jun 28:2025.06.26.661510. doi: 10.1101/2025.06.26.661510.
We present a targeted sequencing-based pipeline that profiles microsatellite instability (MSI) at single-nucleotide resolution. Targeted amplicons from the five widely studied Bethesda panel microsatellite loci were sequenced on the Oxford Nanopore Technologies platform in two microsatellite-unstable colorectal cancer cell lines (HCT15, HCT116), two microsatellite-stable cancer cell lines (TK6, U2OS), and two peripheral blood mononuclear cell samples from healthy donors. We developed an anchor-extension algorithm that captures repeat motifs while allowing interruptions, using a threshold informed by platform-specific error. Cluster-aware Dirichlet-multinomial and beta-binomial tests were applied for between-sample comparisons, accounting for read-level clustering within samples. The algorithm revealed repeat profiles in HCT15 and HCT116 that were distinct from those of the other cell types, and it uncovered allelic diversity across samples at different MSI loci. Our approach complements existing short tandem repeat callers by preserving read-level diversity and delivering targeted, quantitative MSI calls, with potential applications in mechanistic research and clinical assay development.
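The abstract does not specify how the anchor-extension step is implemented, so the following is only a minimal sketch of the general idea: locate a perfect run of the motif (the anchor), then extend it rightwards while bridging short interruptions. The function name `anchor_extend`, the `RepeatCall` record, and the parameters `min_anchor_units` and `max_gap` are hypothetical; in particular, `max_gap` is an illustrative stand-in for the error-informed threshold and is not the authors' actual criterion.

```python
from typing import NamedTuple, Optional


class RepeatCall(NamedTuple):
    start: int          # 0-based index of the first base of the tract
    end: int            # index one past the last base of the tract
    units: int          # number of perfect motif copies inside the tract
    interruptions: int  # number of non-motif gaps that were bridged


def anchor_extend(read: str, motif: str, min_anchor_units: int = 3,
                  max_gap: int = 1) -> Optional[RepeatCall]:
    """Find the first perfect anchor, then extend rightwards across short gaps.

    Hypothetical illustration only; max_gap stands in for a threshold that
    would in practice be tuned to the sequencing platform's error profile.
    """
    k = len(motif)
    anchor = motif * min_anchor_units
    start = read.find(anchor)
    if start == -1:
        return None  # no anchor found in this read

    pos = start + len(anchor)
    units = min_anchor_units
    interruptions = 0
    while pos < len(read):
        if read.startswith(motif, pos):
            # Perfect continuation of the repeat.
            pos += k
            units += 1
            continue
        # Bridge a gap of 1..max_gap bases, but only if the motif
        # resumes immediately after the gap.
        for gap in range(1, max_gap + 1):
            if read.startswith(motif, pos + gap):
                pos += gap + k
                units += 1
                interruptions += 1
                break
        else:
            break  # motif does not resume within the allowed gap
    return RepeatCall(start, pos, units, interruptions)


# A dinucleotide tract with a single-base interruption (toy sequence).
toy_read = "GGT" + "CA" * 4 + "T" + "CA" * 3 + "GGTT"
call = anchor_extend(toy_read, "CA")
```

Because each read yields its own `RepeatCall`, the per-read tract lengths and interruption counts can be tabulated per sample, which is the kind of read-level distribution that between-sample tests such as the beta-binomial would then compare.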