Principal component analysis (PCA) is a mathematical procedure that uses an orthogonal transformation to convert a set of observations of possibly correlated variables into a set of values of uncorrelated variables called principal components. The number of principal components is less than or equal to the number of original variables. The transformation is defined so that the first principal component has the largest possible variance (that is, it accounts for as much of the variability in the data as possible), and each succeeding component in turn has the highest variance possible under the constraint that it is orthogonal to (uncorrelated with) the preceding components. The principal components are guaranteed to be independent only if the data set is jointly normally distributed. PCA is sensitive to the relative scaling of the original variables. Depending on the field of application, it is also known as the discrete Karhunen-Loève transform (KLT), the Hotelling transform, or proper orthogonal decomposition (POD).
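Making the variance-maximization idea concrete (the notation here is added for illustration and does not appear in the original text): if $\mathbf{X}$ is the mean-centered data matrix with one observation per row, the first principal component direction is the unit vector

$$\mathbf{w}_{(1)} = \arg\max_{\|\mathbf{w}\| = 1} \|\mathbf{X}\mathbf{w}\|^{2},$$

and the $k$-th direction maximizes the same quantity subject to being orthogonal to the first $k-1$ directions.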
PCA was invented in 1901 by Karl Pearson. It is now used mostly as a tool in exploratory data analysis and for making predictive models. PCA can be done by eigenvalue decomposition of the data covariance matrix or by singular value decomposition of the data matrix, usually after mean centering the data for each attribute. The results of a PCA are usually discussed in terms of component scores (the transformed variable values corresponding to a particular case in the data) and loadings (the weight by which each standardized original variable must be multiplied to get the component score).
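As an illustrative sketch (not part of the original text; the function and variable names are assumptions of mine), PCA via singular value decomposition of the mean-centered data matrix might look like this in Python with NumPy:

```python
import numpy as np

def pca_svd(X):
    """PCA via SVD of the mean-centered data matrix.

    X: (n_samples, n_features) array.
    Returns component scores, loadings (principal directions), and
    the variance explained by each component.
    """
    # Mean-center each attribute (column)
    Xc = X - X.mean(axis=0)

    # Thin SVD: Xc = U @ diag(s) @ Vt
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    scores = U * s                      # projections of the data onto the components
    loadings = Vt.T                     # columns are the principal directions
    explained_var = s**2 / (X.shape[0] - 1)
    return scores, loadings, explained_var
```

Eigenvalue decomposition of the covariance matrix would give the same directions; the SVD route is simply applied directly to the centered data matrix.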
PCA is the simplest of the true eigenvector-based multivariate analyses. Often, its operation can be thought of as revealing the internal structure of the data in the way that best explains the variance in the data. If a multivariate data set is viewed as a set of coordinates in a high-dimensional data space (one axis per variable), PCA can supply the user with a lower-dimensional picture, a "shadow" of the object as seen from its (in some sense) most informative viewpoint. This is done by using only the first few principal components, so that the dimensionality of the transformed data is reduced, as in the sketch below.
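Continuing the hypothetical sketch above, keeping only the first few component scores yields the reduced-dimension "shadow" of the data; for example, projecting 200 observations of 10 variables onto the first two principal components:

```python
# Hypothetical usage of the pca_svd sketch above.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
X[:, 1] = 0.8 * X[:, 0] + 0.2 * X[:, 1]   # introduce some correlation

scores, loadings, explained_var = pca_svd(X)
X_reduced = scores[:, :2]                  # lower-dimensional "shadow" of the data
print(X_reduced.shape)                     # (200, 2)
```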
PCA is closely related to factor analysis, and some statistical packages (such as Stata) deliberately conflate the two techniques. True factor analysis makes different assumptions about the underlying structure and solves for the eigenvectors of a slightly different matrix.