Is it wise to run PCA on every segmentation data set before clustering it?
One of the main uses of PCA is to reduce the dimensionality of the data set. That’s especially worthwhile when the data has many features and comparatively few data points. Such data suffers from the ‘curse of dimensionality’: in a high-dimensional space the distances between points become less informative, which hurts distance-based clustering. Techniques such as PCA mitigate this, and that’s why we perform PCA before clustering.
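As a rough illustration of that workflow, here is a minimal sketch using scikit-learn. The data is synthetic, and the numbers of features, components, and clusters are arbitrary assumptions made for the example, not values from the tutorial.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a segmentation data set: 200 rows, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))

# Standardize so that no single feature dominates the principal components.
X_std = StandardScaler().fit_transform(X)

# Keep 3 components (an arbitrary choice for this sketch).
pca = PCA(n_components=3)
scores = pca.fit_transform(X_std)

# Cluster on the component scores instead of the raw features.
kmeans = KMeans(n_clusters=4, n_init=10, random_state=0)
labels = kmeans.fit_predict(scores)

print(X.shape, scores.shape, labels.shape)
```

In practice you would pick the number of components by looking at `pca.explained_variance_ratio_` rather than fixing it up front.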
Thank you 😀
Reference: “How to Combine PCA and K-means in Python?”
Why does the dimensionality of the matrix get reduced to a 4 X W matrix for the output in the Principal Component Data frame? Are the rows in my original data being dropped?