Are there any differences between features of proteins expressed in malignant and benign breast cancers?

Author information

Ebrahimi Mansour, Ebrahimie Esmaeil, Shamabadi Narges, Ebrahimi Mahdi

Affiliation

Bioinformatics Research Group, Green Research Center, Qom University, Qom, Iran.

Publication information

J Res Med Sci. 2010 Nov;15(6):299-309.

Abstract

BACKGROUND

Breast cancer is the most common cancer among women and the second leading cause of cancer death in women. Many approaches have been used to analyze and detect its benign and malignant forms, and understanding the features of the proteins expressed by the various types of breast cancer is crucial.

METHODS

Features of proteins expressed in malignant breast cancer, in benign breast cancer, and in both were compared using different screening techniques, clustering methods, decision tree models, and generalized rule induction (GRI) algorithms to look for patterns of similarity within the benign and malignant groups.

RESULTS

The N-terminal amino acid was Met, and 57 of the 838 protein features were ranked as important (p > 0.05). The depth of the trees induced by the tree induction models ranged from 2 branches (C5.0 model) to 5 branches (QUEST model). The C&RT model gave the best performance evaluation and the CHAID model the worst. Applying feature selection to the datasets produced no significant difference in the percentage of correctness, performance evaluation, or mean correctness of the tree induction algorithms, but it significantly reduced the number of peer groups (p < 0.05).

CONCLUSIONS

The frequency of Ile-Ile was the most important protein attribute in all tree and rule induction models. The importance of sequence-based classification and of the Ile-Ile frequency in predicting malignant and benign breast cancer is discussed.
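
As an illustration of the kind of sequence-based attribute named above, the following minimal Python sketch computes dipeptide frequencies from a one-letter amino acid sequence and reads off the Ile-Ile ("II") value. The abstract does not state how the frequency was defined, so the overlapping-window count normalized by the number of dipeptides is an assumption, and the function name `dipeptide_frequencies` and the toy sequence are hypothetical and not taken from the study.

```python
from collections import Counter
from typing import Dict


def dipeptide_frequencies(sequence: str) -> Dict[str, float]:
    """Return the relative frequency of each overlapping dipeptide.

    Assumption: frequency = count of the dipeptide divided by the total
    number of overlapping dipeptides (len(sequence) - 1). The study may
    have used a different definition (for example, a raw count).
    """
    seq = sequence.upper()
    # Slide a window of width 2 across the sequence, one residue at a time.
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        return {}
    counts = Counter(pairs)
    total = len(pairs)
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy sequence, not a protein from the study's dataset.
toy_sequence = "MIIKLIIA"
frequencies = dipeptide_frequencies(toy_sequence)

# "II" is the one-letter code for the Ile-Ile dipeptide.
ile_ile_frequency = frequencies.get("II", 0.0)
print(f"Ile-Ile frequency: {ile_ile_frequency:.4f}")
```

A value computed this way for each protein could serve as one column among the protein features passed to a tree or rule induction model.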


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c9ad/3082830/e4ba3f4bd95a/JRMS-15-299-g001.jpg
