Resolved: Choosing the correct algorithm

Question

I know that different algorithms are used for dimensionality reduction depending on whether the data is linear or nonlinear. How can we tell whether the data is linear when it has a large number of dimensions?

Answer 1

Hi Doaa!
Great question!
One way is to run PCA and look at how many components are needed to explain most of the variance. For example, if your data has 500 dimensions and the first 10 or 20 components explain 90% of the variance, it likely lies close to a low-dimensional linear subspace and PCA is a good choice. But if you need 100 or more components to get there, a linear projection is not compressing the data well, which suggests either more complex, nonlinear structure or simply a high intrinsic dimensionality. You can also try projecting the data to 2D using both PCA and a nonlinear method like t-SNE or UMAP. If the nonlinear methods show clearer clusters or shapes, it's another sign that your data isn't well captured by linear techniques. Treat both checks as heuristics rather than proofs, since t-SNE and UMAP plots depend on their settings.
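Here is a rough sketch of the first check, using only NumPy. The helper name components_for_variance, the 90% threshold, and the synthetic datasets are my own assumptions for illustration, so swap in your own data matrix (rows are samples, columns are features). The second dataset is pure noise, chosen only to show the opposite end of the curve. It is not an example of nonlinear structure.

```python
import numpy as np

def components_for_variance(X, threshold=0.90):
    """Count the principal components needed to explain `threshold` of the variance.

    Hypothetical helper for illustration; X has shape (n_samples, n_features).
    """
    X_centered = X - X.mean(axis=0)
    # Squared singular values of the centered data are proportional
    # to the variance captured by each principal component.
    singular_values = np.linalg.svd(X_centered, compute_uv=False)
    variances = singular_values ** 2
    cumulative = np.cumsum(variances) / variances.sum()
    # First position where the cumulative ratio reaches the threshold.
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative)), cumulative

rng = np.random.default_rng(0)
n_samples, n_features, n_latent = 1000, 500, 10

# Assumed toy data: 10 latent factors mapped linearly into 500 dimensions,
# plus a small amount of noise. This is the "linear subspace" case.
Z = rng.normal(size=(n_samples, n_latent))
W = rng.normal(size=(n_latent, n_features))
X_linear = Z @ W + 0.01 * rng.normal(size=(n_samples, n_features))

# Assumed toy data: isotropic noise with no low-dimensional linear structure.
X_noise = rng.normal(size=(n_samples, n_features))

k_linear, cum_linear = components_for_variance(X_linear)
k_noise, cum_noise = components_for_variance(X_noise)

print("components for 90% variance (linear data):", k_linear)
print("components for 90% variance (noise data): ", k_noise)
```

If you already use scikit-learn, the explained_variance_ratio_ attribute of a fitted PCA gives the same per-component ratios, and a cumulative sum over it plays the role of the curve computed here.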
Hope this helps.
Best,
Ivan
