A multivariate prediction model for microarray cross-hybridization.

Authors

Chen Yian A, Chou Cheng-Chung, Lu Xinghua, Slate Elizabeth H, Peck Konan, Xu Wenying, Voit Eberhard O, Almeida Jonas S

Affiliation

Department of Biostatistics, Bioinformatics, and Epidemiology, Medical University of South Carolina, Charleston, SC, USA.

Publication

BMC Bioinformatics. 2006 Mar 1;7:101. doi: 10.1186/1471-2105-7-101.

Abstract

BACKGROUND

Expression microarray analysis is one of the most popular molecular diagnostic techniques in the post-genomic era. However, this technique faces the fundamental problem of potential cross-hybridization. This is a pervasive problem for both oligonucleotide and cDNA microarrays; it is considered particularly problematic for the latter. No comprehensive multivariate predictive modeling has been performed to understand how multiple variables contribute to (cross-) hybridization.

RESULTS

We propose a systematic search strategy using multiple multivariate models [multiple linear regressions, regression trees, and artificial neural network analyses (ANNs)] to select an effective set of predictors for hybridization. We validate this approach on a set of DNA microarrays with cytochrome P450 family genes. The performance of our multiple multivariate models is compared with that of a recently proposed third-order polynomial regression method that uses percent identity as the sole predictor. All multivariate models agree that the 'most contiguous base pairs between probe and target sequences,' rather than percent identity, is the best univariate predictor. The predictive power is improved by inclusion of additional nonlinear effects, in particular target GC content, when regression trees or ANNs are used.
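To make the three predictors named above concrete, the following minimal sketch computes percent identity, the longest run of contiguous matching bases, and target GC content for one probe/target pair. It assumes the two sequences are already aligned to equal length with `-` marking gaps; the function name `sequence_features`, the dictionary keys, and these exact feature definitions are illustrative assumptions, not the definitions used in the paper.

```python
def sequence_features(probe: str, target: str) -> dict:
    """Compute simple sequence features for a pre-aligned probe/target pair.

    Hypothetical helper: assumes equal-length aligned strings, '-' for gaps.
    """
    if len(probe) != len(target):
        raise ValueError("aligned sequences must have equal length")
    probe, target = probe.upper(), target.upper()

    matches = 0   # total number of identical aligned positions
    longest = 0   # longest run of consecutive identical positions
    current = 0   # length of the run currently being extended
    for p, t in zip(probe, target):
        if p == t and p != "-":
            matches += 1
            current += 1
            longest = max(longest, current)
        else:
            current = 0  # a mismatch or gap breaks the contiguous run

    # GC fraction of the target, ignoring gap characters
    target_bases = [b for b in target if b != "-"]
    gc = sum(b in "GC" for b in target_bases) / len(target_bases) if target_bases else 0.0

    return {
        "percent_identity": 100.0 * matches / len(probe) if probe else 0.0,
        "longest_contiguous_match": longest,
        "target_gc_content": gc,
    }


features = sequence_features("ACGTACGTAC", "ACGTTCGTAC")
print(features)
```

Two pairs with the same percent identity can differ sharply in their longest contiguous match, depending on where the mismatches fall. That difference is one plausible reason the contiguous-match feature could carry information that percent identity alone does not.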

CONCLUSION

We provide a systematic multivariate approach for assessing the importance of multiple sequence features for hybridization and the relationships among these features. This approach can easily be applied to larger datasets, which should enable future development of generalized hybridization models that can correct for false-positive cross-hybridization signals in expression experiments.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5f5e/1409802/10df6232b1b2/1471-2105-7-101-1.jpg
