Dropping a dummy variable - Why?

Question

Why exactly do we need to drop a dummy variable? Is it for mathematical reasons, or because reason 0 doesn't give us any information? Do we need to do this for every data frame we work with, or just on this occasion?

Answer 1

Hi Colton,

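The reason is that we are trying to avoid multicollinearity. If we keep a dummy for every category, the dummies in each row always add up to 1, which duplicates the information in the intercept, so the regression cannot determine the coefficients uniquely. This is a mathematical issue rather than something specific to this data set, so it applies whenever you create dummies from a categorical variable for a regression with an intercept.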
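To see the multicollinearity point in numbers, here is a minimal sketch. It assumes a made-up data frame with a single hypothetical column called `reason` (not the course data set) and uses `pandas.get_dummies` with `numpy.linalg.matrix_rank`. The rank of the design matrix tells us how many columns carry independent information.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one categorical column with three levels.
df = pd.DataFrame({"reason": ["a", "b", "c", "a", "b", "c"]})

# Case 1: keep a dummy for every category and add an intercept column of ones.
all_dummies = pd.get_dummies(df["reason"], dtype=float)
X_full = np.column_stack([np.ones(len(df)), all_dummies.to_numpy()])

# Case 2: drop the first dummy, so that category becomes the baseline.
reduced = pd.get_dummies(df["reason"], drop_first=True, dtype=float)
X_reduced = np.column_stack([np.ones(len(df)), reduced.to_numpy()])

rank_full = np.linalg.matrix_rank(X_full)
rank_reduced = np.linalg.matrix_rank(X_reduced)

# 4 columns but rank 3: the intercept equals the sum of the three dummies.
print(X_full.shape[1], rank_full)
# 3 columns and rank 3: every column now carries independent information.
print(X_reduced.shape[1], rank_reduced)
```

With all dummies kept, one column is redundant; after dropping one, the dropped category is simply the case where all remaining dummies are 0, so no information is lost.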

We have this topic covered in one of our other courses (the Data Science Course).

Alternatively, feel free to get familiar with the mathematics of it here (written by your instructor, Iliya):

https://www.quora.com/How-and-why-having-the-same-number-of-dummy-variables-as-categories-is-problematic-in-linear-regression-Dummy-variable-trap-Im-looking-for-a-purely-mathematical-not-intuitive-explanation-Also-please-avoid-using-the/answer/Iliya-Valchanov?__filter__=all&;__nsrc__=1&__snid3__=3828155711

Best,
The 365 Team
