Resolved: Complexity of the model
Question 1: For the mnths_since_last_deliq variable, we also create a dummy variable for missing values. If we already create separate dummy variables after applying fine and coarse classing, why do we need a separate variable for missing values? They do not seem to provide any information to the model.
Question 2: Why create so many dummy variables? Why can't we encode each original variable as a single variable by applying WoE to it? Aren't we increasing the complexity by creating this many variables?
Hi Nilkant,
Thanks for the great questions!
I'll try to answer to the best of my understanding.
Question 1:
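Missing values can carry meaningful information. When values are not missing at random, the fact that a value is missing can itself be predictive. For a variable like mnths_since_last_deliq, a missing value may mean that the borrower has no recorded delinquency at all, which is very different from a recent one. Giving missing values their own dummy variable lets the model estimate a separate effect for that group instead of discarding it or mixing it into another category.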
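To illustrate the point about missing values, here is a minimal sketch using pandas. The data is a made-up toy example, and the column name good_bad and its 1 = good / 0 = bad convention are assumptions for illustration only. It flags missing values with a dummy variable and compares the share of good borrowers in the two groups.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: NaN means no value was recorded for the borrower.
df = pd.DataFrame({
    "mnths_since_last_deliq": [3.0, np.nan, 12.0, np.nan, 40.0, np.nan, 7.0, np.nan],
    "good_bad": [0, 1, 1, 1, 1, 1, 0, 0],  # assumed convention: 1 = good, 0 = bad
})

# Dummy variable that is 1 where the value is missing and 0 otherwise.
df["mnths_since_last_deliq:Missing"] = (
    df["mnths_since_last_deliq"].isnull().astype(int)
)

# Share of good borrowers among non-missing (0) and missing (1) rows.
good_rate = df.groupby("mnths_since_last_deliq:Missing")["good_bad"].mean()
print(good_rate)
```

If the two rates differ noticeably on real data, the missing group behaves differently from the rest, and the dummy variable gives the model a way to use that.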
Question 2:
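Giving each category its own dummy variable lets the model estimate a separate coefficient per category, so it can learn non-linear relationships that a single WoE-encoded variable might obscure. But you are right, there is a clear trade-off in terms of complexity, and we need to balance this.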
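The two encodings can be compared side by side in a small sketch. The data below is a made-up toy example, the column names grade and good_bad are hypothetical, and WoE is computed as ln(share of goods / share of bads), which is one common convention and an assumption here.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one coarse-classed variable and a good/bad flag.
df = pd.DataFrame({
    "grade": ["A", "A", "A", "B", "B", "B", "C", "C", "C", "C"],
    "good_bad": [1, 1, 0, 1, 0, 1, 0, 0, 1, 0],  # assumed: 1 = good, 0 = bad
})

# Option 1: dummy variables, one column (and later one coefficient) per category.
dummies = pd.get_dummies(df["grade"], prefix="grade", prefix_sep=":")

# Option 2: WoE encoding, a single numeric column for the whole variable.
stats = df.groupby("grade")["good_bad"].agg(n_good="sum", n_obs="count")
stats["n_bad"] = stats["n_obs"] - stats["n_good"]
stats["woe"] = np.log(
    (stats["n_good"] / stats["n_good"].sum())
    / (stats["n_bad"] / stats["n_bad"].sum())
)
woe_encoded = df["grade"].map(stats["woe"])

print(dummies.columns.tolist())  # several columns for one original variable
print(stats["woe"])              # one WoE value per category
```

With dummy variables, one category is normally dropped as the reference category before fitting, and each remaining category gets its own coefficient. With the WoE column, the model fits a single coefficient for the whole variable, so the relative spacing between categories is fixed in advance by the WoE values.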
Best,
Ned