Quality assessment metrics for whole genome gene expression profiling of paraffin embedded samples.

Authors

Mahoney Douglas W, Therneau Terry M, Anderson S Keith, Jen Jin, Kocher Jean-Pierre A, Reinholz Monica M, Perez Edith A, Eckel-Passow Jeanette E

Affiliation

Biomedical Statistics and Informatics, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA.

Publication

BMC Res Notes. 2013 Jan 30;6:33. doi: 10.1186/1756-0500-6-33.

Abstract

BACKGROUND

Formalin-fixed, paraffin-embedded tissues are most commonly used for routine pathology analysis and for long-term tissue preservation in the clinical setting. Many institutions have large archives of formalin-fixed, paraffin-embedded tissues that provide a unique opportunity for understanding genomic signatures of disease. However, genome-wide expression profiling of formalin-fixed, paraffin-embedded samples has been challenging due to RNA degradation. Because of the significant heterogeneity in tissue quality, normalization and analysis of these data present particular challenges. The distribution of intensity values from archival tissues is inherently noisy and skewed due to differential sample degradation, raising two primary concerns: whether a highly skewed array will unduly influence initial normalization of the data, and whether outlier arrays can be reliably identified.

FINDINGS

Two simple extensions of common regression diagnostic measures are introduced: one measures the stress an array undergoes during normalization, and the other measures how much a given array deviates from the remaining arrays post-normalization. These metrics are applied to a study involving 1618 formalin-fixed, paraffin-embedded HER2-positive breast cancer samples from the N9831 adjuvant trial, processed with Illumina's cDNA-mediated Annealing, Selection, extension and Ligation (DASL) assay.
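
The abstract does not give formulas for the two metrics, so the following is only a rough sketch of the general idea and not the authors' definitions. It assumes quantile normalization as the normalization step, a root-mean-square change per array as a stand-in "stress" score, and a leave-one-out root-mean-square distance from the probe-wise median of the other arrays as a stand-in "deviation" score. The function names and the simulated data are hypothetical.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (arrays) of a probes-by-arrays matrix."""
    order = np.argsort(x, axis=0)
    ranks = np.argsort(order, axis=0)            # rank of each probe within its array
    reference = np.sort(x, axis=0).mean(axis=1)  # mean sorted profile across arrays
    return reference[ranks]

def stress(raw, normalized):
    """Assumed stress score: RMS change each array undergoes during normalization."""
    return np.sqrt(((normalized - raw) ** 2).mean(axis=0))

def deviation(normalized):
    """Assumed deviation score: RMS distance of each array from the
    probe-wise median of the remaining arrays (leave-one-out)."""
    n_arrays = normalized.shape[1]
    out = np.empty(n_arrays)
    for j in range(n_arrays):
        others = np.delete(normalized, j, axis=1)
        center = np.median(others, axis=1)
        out[j] = np.sqrt(((normalized[:, j] - center) ** 2).mean())
    return out

# Simulated data (hypothetical): five similar arrays plus one skewed, unrelated array.
rng = np.random.default_rng(0)
base = rng.normal(8.0, 1.0, size=500)
raw = base[:, None] + rng.normal(0.0, 0.1, size=(500, 6))
raw[:, 5] = rng.gamma(2.0, 2.0, size=500)  # mimics a heavily degraded sample

norm = quantile_normalize(raw)
s = stress(raw, norm)
d = deviation(norm)
print("stress per array:   ", np.round(s, 3))
print("deviation per array:", np.round(d, 3))
```

In this toy setting the skewed array should stand out on both scores; in a real study, arrays flagged this way would be candidates for removal or down-weighting.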

CONCLUSION

Proper assessment of array quality within a research study is crucial for controlling unwanted variability in the data. The metrics proposed in this paper have direct biological interpretations and can be used to identify arrays that should either be removed from analysis altogether or down-weighted to reduce their influence in downstream analyses.


