An Analytic Pipeline to Obtain Reliable Genetic Ancestry Estimates from Tumor-Derived RNA Sequencing Data.

Author information

Johnson Courtney E, Ran Ximing, Wrobel Julia, Davidson Natalie R, Greene Casey S, Epstein Michael P, Marks Jeffrey R, Peres Lauren C, Doherty Jennifer A, Schildkraut Joellen M

Affiliations

Department of Epidemiology, Rollins School of Public Health, Emory University, Atlanta, Georgia.

Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, Georgia.

Publication information

Cancer Epidemiol Biomarkers Prev. 2025 Sep 2;34(9):1593-1599. doi: 10.1158/1055-9965.EPI-25-0371.

Abstract

BACKGROUND

Germline genetics may influence tumor molecular characteristics and ultimately cancer survival. Studies of tumor characteristics, including our epithelial ovarian cancer (EOC) studies of Black women in the United States, may have RNA sequencing (RNA-seq) data from archival tumor tissue but lack germline DNA for at least some individuals. Incomplete germline DNA measurements impede analyses of important measures such as global genetic ancestry, often used in downstream analyses, by reducing sample sizes.

METHODS

The study population consists of 184 women who participated in two population-based studies of EOC with both germline and formalin-fixed, paraffin-embedded (FFPE) tumor samples and an additional 58 women diagnosed with EOC from the same two studies with only FFPE tumor tissue. We used tumor RNA-seq data to calculate proportions of African, European, and Asian genetic ancestry using a pipeline built on the packages SeqKit, HISAT2, SAMtools, BCFtools, PLINK, and ADMIXTURE. Women from the 1000 Genomes Project were used as the reference populations, and germline genetic ancestry estimates from blood or saliva were used as the baseline comparison. We evaluated multiple quality control strategies to improve genetic ancestry estimation.
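
To make the order of the steps concrete, here is a minimal sketch of how the named tools might be chained for a single sample. It is not the authors' pipeline. The abstract gives no command-line options, so every flag below is an assumption: single-end reads, PLINK 1.9 syntax, per-sample variant calling, and unsupervised ADMIXTURE with K = 3 (chosen to match the three ancestries reported). The function name `ancestry_commands` and all file names are hypothetical. SeqKit is left out because the abstract does not say how it was used. The function only builds command strings and runs nothing.

```python
import shlex


def ancestry_commands(sample, fastq, hisat2_index, ref_fasta, ref_panel, k=3):
    """Return shell command strings for one tumor RNA-seq sample.

    All arguments are hypothetical paths or prefixes; nothing is executed.
    """
    q = shlex.quote
    sam = f"{sample}.sam"
    bam = f"{sample}.sorted.bam"
    vcf = f"{sample}.vcf.gz"
    merged = f"{sample}.merged"
    return [
        # Align reads to the reference genome (single-end input assumed).
        f"hisat2 -x {q(hisat2_index)} -U {q(fastq)} -S {q(sam)}",
        # Coordinate-sort and index the alignments.
        f"samtools sort -o {q(bam)} {q(sam)}",
        f"samtools index {q(bam)}",
        # Pile up reads and call variants into a compressed VCF.
        f"bcftools mpileup -Ou -f {q(ref_fasta)} {q(bam)}"
        f" | bcftools call -mv -Oz -o {q(vcf)}",
        # Convert to PLINK binary format, then merge with the reference panel
        # (assumed to be a PLINK binary fileset prefix, e.g. 1000 Genomes).
        f"plink --vcf {q(vcf)} --make-bed --out {q(sample)}",
        f"plink --bfile {q(sample)} --bmerge {q(ref_panel)}"
        f" --make-bed --out {q(merged)}",
        # Estimate K ancestry proportions for every individual in the merged set.
        f"admixture {q(merged + '.bed')} {k}",
    ]


if __name__ == "__main__":
    for cmd in ancestry_commands("s1", "s1.fq.gz", "grch38", "grch38.fa", "ref1kg"):
        print(cmd)
```

In practice the quality control strategies the abstract mentions would sit between variant calling and ADMIXTURE; they are omitted here because the abstract does not describe them.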

RESULTS

Correlations between tumor RNA-seq-derived estimates of genetic ancestry from our pipeline and germline-derived African and European genetic ancestry ranged between 0.76 and 0.94.
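
As an illustration of the comparison reported here, the sketch below computes a correlation between paired ancestry proportions. The abstract does not state which correlation coefficient was used, so Pearson's r is an assumption, and the proportions in the example are made-up values, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical African ancestry proportions for five individuals.
    germline = [0.92, 0.78, 0.85, 0.60, 0.97]   # from blood or saliva DNA
    tumor_rna = [0.90, 0.74, 0.88, 0.65, 0.93]  # from tumor RNA-seq
    print(f"r = {pearson_r(germline, tumor_rna):.2f}")
```

A rank-based coefficient such as Spearman's would be a reasonable alternative when the proportions cluster near 0 or 1.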

CONCLUSIONS

RNA-seq data from archival FFPE tumor tissue can be confidently and efficiently used to approximate global genetic ancestry in an admixed population when germline DNA is unavailable.

IMPACT

This approach supports analyses of genetic ancestry and cancer when germline samples are not available.
