Dropping a dummy variable - Why?
Why exactly do we need to drop the dummy variable? Is it for mathematical reasons, or because reason 0 doesn't give us any information? Do we need to do this for every data frame we work with, or just on this occasion?
1 answer (0 marked as helpful)
Hi Colton,
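We drop one dummy to avoid multicollinearity. If you keep a dummy for every category, and each observation belongs to exactly one category, the dummies sum to 1 in every row. That makes them perfectly collinear with the regression's constant term, so the coefficients cannot be uniquely estimated. The reason is therefore mathematical, and it applies whenever a full set of dummies goes into a regression that includes an intercept, not only to this data frame.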
This topic is covered in one of our other courses (the Data Science Course).
Alternatively, you can read about the mathematics behind it here (written by your instructor, Iliya):
https://www.quora.com/How-and-why-having-the-same-number-of-dummy-variables-as-categories-is-problematic-in-linear-regression-Dummy-variable-trap-Im-looking-for-a-purely-mathematical-not-intuitive-explanation-Also-please-avoid-using-the/answer/Iliya-Valchanov?__filter__=all&__nsrc__=1&__snid3__=3828155711
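To make the multicollinearity point concrete, here is a minimal sketch in Python. It is not taken from the course material: the toy `reason` column, its three levels, and the helper `with_intercept` are all hypothetical and chosen only for illustration. It compares the rank of the design matrix (intercept plus dummies) with and without dropping one dummy.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one categorical column with three levels.
df = pd.DataFrame({"reason": ["a", "b", "c", "a", "b", "c"]})

# One dummy per category (nothing dropped).
full = pd.get_dummies(df["reason"], dtype=float)

# Same encoding, but with the first category dropped.
reduced = pd.get_dummies(df["reason"], drop_first=True, dtype=float)


def with_intercept(dummies):
    """Hypothetical helper: prepend a column of ones (the constant term)."""
    X = dummies.to_numpy()
    return np.column_stack([np.ones(len(X)), X])


X_full = with_intercept(full)
X_reduced = with_intercept(reduced)

# A design matrix is rank deficient when its rank is below its column count.
rank_full = np.linalg.matrix_rank(X_full)
rank_reduced = np.linalg.matrix_rank(X_reduced)

print("all dummies:", X_full.shape[1], "columns, rank", rank_full)
print("one dropped:", X_reduced.shape[1], "columns, rank", rank_reduced)
```

With all dummies kept, the matrix has more columns than its rank, because the dummy columns add up to the intercept column. After dropping one dummy, the rank equals the number of columns. The dropped category then acts as the baseline, and the remaining dummy coefficients are read relative to it.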
Best,
The 365 Team