Resolved: Different approach for making dummy variables
I have followed a different approach for dealing with dummy variables: using incrementing integer values when a feature has more than 2 unique values. For instance, for the brand feature, I increment from 0 to 6 to create key-value pairs like the example below:
Audi : 0,
Toyota : 2
Then I just map the above dict onto my dataframe and use the resulting single column as a dummy feature in my model. I have gotten pretty good results with this (R-squared = 0.83 and the mean residual of y_test - y_hat is around 0.017).
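A minimal sketch of the mapping step described above, assuming a pandas DataFrame with a hypothetical `brand` column. The data and the names `brand_to_code` and `brand_code` are illustrative, not taken from the actual notebook:

```python
import pandas as pd

# Hypothetical data; the brands listed here are illustrative only.
df = pd.DataFrame({"brand": ["Audi", "BMW", "Toyota", "Audi"]})

# Build the key-value pairs by numbering the unique brands
# in order of first appearance, starting from 0.
brand_to_code = {brand: code for code, brand in enumerate(df["brand"].unique())}

# Map the dict onto the column to get one integer-coded feature.
df["brand_code"] = df["brand"].map(brand_to_code)
```

With this toy data the mapping comes out as Audi : 0, BMW : 1, Toyota : 2.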
What do you think of my approach?
Thank you for reaching out!
The approach you've followed is indeed completely valid. In fact, there is a transformer in sklearn called OrdinalEncoder which does exactly that :)
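As a rough sketch of how OrdinalEncoder could be used here, assuming a pandas DataFrame with a hypothetical `brand` column (the data and column names are made up for illustration):

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

# Hypothetical data; the brands listed here are illustrative only.
df = pd.DataFrame({"brand": ["Toyota", "Audi", "BMW", "Audi"]})

# OrdinalEncoder expects a 2D input, hence the double brackets.
encoder = OrdinalEncoder()
df["brand_code"] = encoder.fit_transform(df[["brand"]]).ravel()

# The learned categories; each brand's position in this array is its code.
brands = encoder.categories_[0]
```

By default the encoder numbers the categories in sorted order and returns floats, so the codes may differ from a hand-built dict that numbers brands in order of appearance.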