Last answered:

11 May 2025

Posted on:

11 May 2025

Resolved: Choosing the correct algorithm

I know that different algorithms are used for dimensionality reduction depending on whether the data is linear or nonlinear. How can we tell whether the data is linear when it has a lot of dimensions?
1 answer (1 marked as helpful)
Instructor
Posted on:

11 May 2025

Hi Doaa!
Great question!
One way is to run PCA and look at how many components you need to explain most of the variance. For example, if your data has 500 dimensions and the first 10 or 20 components explain 90% of the variance, the data lies close to a low-dimensional linear subspace and PCA is a good choice. If you need 100 or more components to get there, PCA is not compressing the data well, which suggests either more complex, nonlinear structure or simply a high intrinsic dimensionality. You can also project the data to 2D with PCA and with a nonlinear method such as t-SNE or UMAP and compare the plots. If the nonlinear methods show clearer clusters or shapes, that's another sign that linear techniques don't capture your data well.
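To make the first check concrete, here is a minimal sketch of counting the components needed to reach 90% of the variance. It uses plain NumPy (PCA via SVD) rather than any particular library's PCA class. The helper name `components_for_variance` and both synthetic datasets are assumptions made up for illustration: one dataset is built to sit near a 10-dimensional linear subspace of a 500-dimensional space, and the other is plain isotropic noise, used only as a stand-in for data with no low-dimensional linear structure.

```python
import numpy as np


def components_for_variance(X, threshold=0.90):
    """Hypothetical helper: number of principal components needed to
    explain at least `threshold` of the total variance, plus the
    cumulative explained-variance curve."""
    Xc = X - X.mean(axis=0)                   # centre each feature
    s = np.linalg.svd(Xc, compute_uv=False)   # singular values, descending
    var = s ** 2                              # proportional to component variances
    cum = np.cumsum(var) / var.sum()          # cumulative explained-variance ratio
    k = int(np.searchsorted(cum, threshold)) + 1
    return min(k, len(cum)), cum              # clamp guards against rounding at 1.0


rng = np.random.default_rng(0)
n_samples, n_features = 1000, 500

# Assumed example 1: 10 latent factors mixed linearly into 500 features,
# plus a little noise, so the data sits near a 10-dimensional subspace.
latent = rng.normal(size=(n_samples, 10))
mixing = rng.normal(size=(10, n_features))
X_linear = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_features))

# Assumed example 2: isotropic noise, with no low-dimensional linear structure.
X_noise = rng.normal(size=(n_samples, n_features))

k_linear, cum_linear = components_for_variance(X_linear)
k_noise, cum_noise = components_for_variance(X_noise)

print("components for 90% variance (near-linear data):", k_linear)
print("components for 90% variance (isotropic noise): ", k_noise)
```

On your own data you would pass your feature matrix (rows as samples) instead of the synthetic arrays, usually after standardising the features. Keep in mind that a large count only tells you PCA compresses the data poorly. It doesn't prove the structure is nonlinear, so the 2D comparison is still worth doing.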
Hope this helps.
Best,
Ivan
