Lazzeri Isaac, Spiegl Benjamin Gernot, Hasenleithner Samantha O, Speicher Michael R, Kircher Martin
Institute of Human Genetics, Diagnostic and Research Center for Molecular BioMedicine, Medical University of Graz, Neue Stiftingtalstr. 6, Graz 8010, Austria.
Division of Oncology, Department of Internal Medicine, Medical University of Graz, 8010 Graz, Austria.
Comput Struct Biotechnol J. 2024 Aug 11;23:3163-3174. doi: 10.1016/j.csbj.2024.08.007. eCollection 2024 Dec.
The analysis of circulating cell-free DNA (cfDNA) holds immense promise as a non-invasive diagnostic tool across various human conditions. However, extracting biological insights from cfDNA fragments entails navigating complex and diverse bioinformatics methods, encompassing not only DNA sequence variation, but also epigenetic characteristics like nucleosome footprints, fragment length, and methylation patterns.
We introduce Liquid Biopsy Feature extract (LBFextract), a comprehensive package designed to streamline feature extraction from cfDNA sequencing data, with the aim of enhancing the reproducibility and comparability of liquid biopsy studies. LBFextract facilitates the integration of preprocessing and postprocessing steps through alignment fragment tags and a hook mechanism. It incorporates various methods, including coverage-based and fragment length-based approaches, alongside two novel feature extraction methods: an entropy-based method to infer TF activity from fragmentomics data and a technique to amplify signals from nucleosome dyads. Additionally, it implements a method to extract condition-specific differentially active TFs based on these features for biomarker discovery. We demonstrate the use of LBFextract for the subtype classification of advanced prostate cancer patients using coverage signals at transcription factor binding sites from cfDNA. We show that LBFextract can generate robust and interpretable features that can discriminate between different clinical groups. LBFextract is a versatile and user-friendly package that can facilitate the analysis and interpretation of liquid biopsy data.
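To make the idea of an entropy-based fragmentomics feature more concrete, the following is a minimal sketch of one possible reading: the Shannon entropy of the fragment length distribution at each position of a window around a binding site. The abstract does not give LBFextract's actual definition, so the function name `fragment_length_entropy`, the half-open `(start, end)` fragment representation, and the choice of bits as the unit are all assumptions made for illustration, not the package's API.

```python
import math
from collections import Counter


def fragment_length_entropy(fragments, window_size):
    """Per-position Shannon entropy (in bits) of fragment lengths.

    Hypothetical helper, not part of LBFextract.
    fragments: iterable of (start, end) half-open intervals, with
               coordinates relative to the start of the window.
    window_size: number of positions in the window.
    """
    # One length histogram per position in the window.
    length_counts = [Counter() for _ in range(window_size)]
    for start, end in fragments:
        length = end - start
        # Clip the fragment to the window before counting.
        for pos in range(max(start, 0), min(end, window_size)):
            length_counts[pos][length] += 1

    entropies = []
    for counts in length_counts:
        total = sum(counts.values())
        if total == 0:
            # No coverage at this position: report zero entropy.
            entropies.append(0.0)
            continue
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


# Two fragments of different lengths overlapping positions 1 to 3.
profile = fragment_length_entropy([(0, 4), (1, 4)], window_size=5)
print(profile)
```

In this sketch, positions covered by fragments of a single length have zero entropy, while positions covered by an even mix of lengths have higher entropy. In practice such a profile would be averaged over many binding sites of the same transcription factor.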
LBFextract is freely accessible at https://github.com/Isy89/LBF. It is implemented in Python and is compatible with Linux and macOS. Code and data to reproduce these analyses have been uploaded to Zenodo (doi: 10.5281/zenodo.10964406).