Resolved: Different approach for making dummy variables
Hi,
I have followed a different approach for dealing with dummy variables: when a feature has more than 2 unique values, I assign incrementing integer values instead. For instance, for the brand feature, I count up from 0 to 6 to create key-value pairs like in the example below:
{
Audi : 0,
Mercedes: 1,
Toyota : 2
}
Then I just map the above dict onto my dataframe and use the resulting single column as a dummy feature in my model. I have gotten pretty good results with this (R-squared = 0.83, and the mean residual of y_test - y_hat is around 0.017).
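The mapping step looks roughly like the sketch below. It assumes a pandas DataFrame; the column names `brand` and `brand_code` and the sample rows are made up for illustration:

```python
import pandas as pd

# Hypothetical sample data; the real dataframe has more rows and columns.
df = pd.DataFrame({"brand": ["Audi", "Toyota", "Mercedes", "Audi"]})

# Hand-written mapping from each brand to an integer code, as in the dict above.
brand_codes = {"Audi": 0, "Mercedes": 1, "Toyota": 2}

# Series.map looks up every value in the dict; brands missing from it become NaN.
df["brand_code"] = df["brand"].map(brand_codes)

print(df)
```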
What do you think of my approach?
Thanks
Hey Mohamed,
Thank you for reaching out!
The approach you've followed is indeed completely valid. In fact, sklearn has a transformer called OrdinalEncoder that does exactly that :)
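Here is a minimal sketch of how that could look, assuming a pandas DataFrame with a hypothetical `brand` column (the sample rows and the `brand_code` name are made up for illustration). Passing `categories` explicitly pins which brand gets which code; if it is left out, the encoder assigns codes in sorted order of the values it sees:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical sample data standing in for the real dataframe.
df = pd.DataFrame({"brand": ["Audi", "Toyota", "Mercedes", "Audi"]})

# One list of categories per encoded column; position in the list = integer code.
encoder = OrdinalEncoder(categories=[["Audi", "Mercedes", "Toyota"]])

# The encoder expects 2D input, hence the double brackets around the column name.
df["brand_code"] = encoder.fit_transform(df[["brand"]]).ravel()

print(df)
```

The encoder returns floats by default; the `dtype` parameter can be set if integer codes are preferred.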
Kind regards,
365 Hristina