Resolved: Choosing the correct algorithm
in Linear Algebra and Feature Selection / A Step-by-Step Explanation of PCA on California Estates – Example
I know that there are different algorithms for dimensionality reduction, depending on whether the data is linear or nonlinear. How can we tell whether it's linear when there are a lot of dimensions?
1 answer (1 marked as helpful)
Hi Doaa!
Great question!
One way is to run PCA and look at how many components are needed to explain most of the variance. For example, if your data has 500 dimensions and the first 10 or 20 components explain 90% of the variance, the data likely lies close to a low-dimensional linear subspace, and PCA is a good choice. But if you need 100 or more components to get there, a linear projection isn't compressing the data well, which suggests it has more complex, possibly nonlinear structure. You can also try projecting the data to 2D using both PCA and a nonlinear method like t-SNE or UMAP. If the nonlinear projections show clearer clusters or shapes than the PCA one, that's another sign that your data isn't well captured by linear techniques.
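Here's a minimal sketch of the explained-variance check, assuming scikit-learn and NumPy are available. The helper name `components_for_variance`, the 90% threshold, and both synthetic datasets are my own illustrations, not something from the course notebook. The first dataset is built from 10 hidden factors, so it sits near a linear subspace; the second is unstructured noise, included only as a contrast where PCA can't compress anything.

```python
import numpy as np
from sklearn.decomposition import PCA


def components_for_variance(X, threshold=0.90):
    """Return how many principal components are needed to reach
    `threshold` cumulative explained variance (hypothetical helper)."""
    pca = PCA().fit(X)  # keep all components so the ratios sum to ~1
    cumulative = np.cumsum(pca.explained_variance_ratio_)
    # Index of the first cumulative value >= threshold, converted to a count.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))


rng = np.random.default_rng(0)
n_samples, n_features, n_factors = 1000, 500, 10

# Illustrative data near a 10-dimensional linear subspace of a 500-D space.
latent = rng.normal(size=(n_samples, n_factors))
mixing = rng.normal(size=(n_factors, n_features))
X_linear = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_features))

# Illustrative contrast: independent noise with no low-dimensional structure.
X_noise = rng.normal(size=(n_samples, n_features))

k_linear = components_for_variance(X_linear)
k_noise = components_for_variance(X_noise)
print("components for 90% variance (low-rank data):", k_linear)
print("components for 90% variance (noise):", k_noise)
```

On your own data you'd call `components_for_variance(X)` after standardizing the features. Keep in mind that a large count only tells you a linear projection isn't enough; it doesn't prove the structure is nonlinear, which is why the 2D comparison with t-SNE or UMAP is a useful second check.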
Hope this helps.
Best,
Ivan