A forecasting method with efficient selection of variables in multivariate data sets.

Author information

Sagar Pinki, Gupta Prinima, Kashyap Indu

Affiliation

Manav Rachna International Institute of Research and Studies, Faridabad, Haryana India.

Publication information

Int J Inf Technol. 2021;13(3):1039-1046. doi: 10.1007/s41870-021-00619-9. Epub 2021 Feb 28.

DOI:10.1007/s41870-021-00619-9

PMID:33681697

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7914390/

Abstract

Regression is a data analysis technique that models the relationship between an independent variable (x) and a dependent variable (y); in polynomial regression, that relationship is modeled as a polynomial of degree n. Polynomial regression fits a nonlinear relationship between the value of x and the corresponding conditional mean of y, denoted E(y|x). In this paper, polynomial regression analysis is improved through efficient selection of variables based on the coefficient of determination. The coefficient of determination (COD) is the square of the correlation between the predicted y values and the actual y values, and it ranges from 0 to 1. The main purpose of regression analysis is to discover the relationship between the independent and dependent variables, in other words, to explain the variation in one variable in terms of another. The focus of this paper is on multivariate data sets, which have many attributes, not all of which are needed for data analysis. Using the COD, irrelevant attributes are eliminated during analysis. The main objectives of the research are to reduce the cost of data maintenance, reduce execution time, and improve prediction accuracy. The COD helps in selecting suitable independent variables. It is a measure used in statistical analysis to assess how well a model explains and forecasts future outcomes. The method also helps eliminate irrelevant variables that the prediction model does not require, which reduces the maintenance cost and the size of the data sets.
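The abstract does not spell out the selection procedure, so the following is only a minimal sketch of the general idea, under stated assumptions: each candidate variable is fitted on its own with a polynomial of a fixed degree, the coefficient of determination of that fit is computed, and variables scoring below a cutoff are dropped. The helper names (`polyfit`, `polyval`, `r_squared`, `select_by_cod`), the degree of 2, the cutoff of 0.5, and the synthetic data are all hypothetical choices for illustration and are not taken from the paper.

```python
import random

def polyfit(xs, ys, degree):
    """Least-squares coefficients c[0] + c[1]*x + ... via the normal equations."""
    m = degree + 1
    # Normal equations A c = b with A[i][k] = sum(x^(i+k)), b[i] = sum(y * x^i).
    A = [[sum(x ** (i + k) for x in xs) for k in range(m)] for i in range(m)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(m)]
    # Gaussian elimination with partial pivoting.
    for col in range(m):
        piv = max(range(col, m), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = A[r][col] / A[col][col]
            for k in range(col, m):
                A[r][k] -= f * A[col][k]
            b[r] -= f * b[col]
    # Back substitution.
    c = [0.0] * m
    for i in reversed(range(m)):
        c[i] = (b[i] - sum(A[i][k] * c[k] for k in range(i + 1, m))) / A[i][i]
    return c

def polyval(coeffs, x):
    """Evaluate the polynomial with coefficients in increasing-power order."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def r_squared(ys, preds):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot

def select_by_cod(columns, ys, degree=2, threshold=0.5):
    """Return (indices of kept columns, per-column COD).

    Assumption: each column is scored by a separate single-variable
    polynomial fit; the paper may score variables differently.
    """
    scores = []
    for xs in columns:
        coeffs = polyfit(xs, ys, degree)
        preds = [polyval(coeffs, x) for x in xs]
        scores.append(r_squared(ys, preds))
    kept = [j for j, s in enumerate(scores) if s >= threshold]
    return kept, scores

# Synthetic data (hypothetical): y depends quadratically on x0, not on x1.
rng = random.Random(0)
n = 200
x0 = [rng.uniform(-3, 3) for _ in range(n)]
x1 = [rng.uniform(-3, 3) for _ in range(n)]
ys = [1.0 + 2.0 * a ** 2 + rng.gauss(0, 0.5) for a in x0]

kept, scores = select_by_cod([x0, x1], ys)
print("kept columns:", kept)
print("COD per column:", [round(s, 3) for s in scores])
```

The polynomial fit is written out in plain Python only to keep the sketch dependency-free; in practice a library routine such as `numpy.polyfit` would normally be used instead. Scoring each variable in isolation ignores interactions between variables, so it should be read as one simple way to apply the COD as a filter, not as the paper's algorithm.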

