Last answered: 04 Nov 2022

Posted on: 04 Nov 2022

Resolved: Different approach for making dummy variables

Hi,

I have followed a different approach for dealing with dummy variables: when a feature has more than 2 unique values, I assign incrementing integer values to them. For instance, for the brand feature, I increment from 0 to 6 to create key-value pairs, as in the example below:
{
    "Audi": 0,
    "Mercedes": 1,
    "Toyota": 2
}
Then I just map the above dict onto my dataframe and use the resulting single column as the dummy feature in my model. I got pretty good results with this (R-squared = 0.83, and the mean residual of y_test - y_hat is around 0.017).
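
For reference, the mapping step could look roughly like the sketch below. It assumes a pandas DataFrame; the column names `Brand` and `Brand_code` and the sample rows are hypothetical and only there for illustration.

```python
import pandas as pd

# Hypothetical sample data; the real dataset and column name may differ.
df = pd.DataFrame({"Brand": ["Audi", "Toyota", "Mercedes", "Audi"]})

# Hand-written mapping from each brand to an integer code.
brand_codes = {"Audi": 0, "Mercedes": 1, "Toyota": 2}

# Series.map looks up every value in the dict;
# brands that are missing from the dict would become NaN.
df["Brand_code"] = df["Brand"].map(brand_codes)

print(df)
```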

What do you think of my approach?

Thanks

1 answer (1 marked as helpful)
Instructor
Posted on: 04 Nov 2022

Hey Mohamed,

Thank you for reaching out!

The approach you've followed is indeed completely valid. In fact, there is a transformer in sklearn called OrdinalEncoder which does exactly that :)
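
As a rough sketch of how that could look with OrdinalEncoder: the `Brand` column and the sample rows below are assumptions made for illustration, and the explicit `categories` list is only there to reproduce the Audi/Mercedes/Toyota order from the question.

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical sample data; the real dataset and column name may differ.
df = pd.DataFrame({"Brand": ["Audi", "Toyota", "Mercedes", "Audi"]})

# Fix the category order explicitly so the integer codes match
# the hand-written mapping (Audi=0, Mercedes=1, Toyota=2).
encoder = OrdinalEncoder(categories=[["Audi", "Mercedes", "Toyota"]])

# OrdinalEncoder expects 2D input, hence the double brackets.
encoded = encoder.fit_transform(df[["Brand"]])

# The result is a 2D float array; flatten it and cast to int for readability.
brand_code = encoded.ravel().astype(int)
print(brand_code)
```

Keep in mind that the encoded column is numeric, so a linear regression will treat the codes as ordered and evenly spaced. It is worth comparing the results against regular dummy variables when the categories have no natural order.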

Kind regards,
365 Hristina
