In the lesson "Data Preparation. Preprocessing Discrete Variables: Creating dummies", it was explained for the variable ‘addr_state’ that grouping into categories for the final model has to be done according to the WoE values, while also taking the number of observations (borrowers) in each category into consideration.
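To make sure I am reading the lesson correctly, here is a minimal sketch of how I understand the WoE table to be built, so that both the WoE and the number of observations per category can be compared side by side. The data is a toy example, not the course dataset, and the function name `woe_table` and the target column name `good_bad` are my own assumptions:

```python
import numpy as np
import pandas as pd

# Toy data (NOT the course dataset): good_bad = 1 for a good borrower, 0 for a bad one.
df = pd.DataFrame({
    "purpose": ["car", "car", "car", "car",
                "small_business", "small_business", "small_business", "small_business"],
    "good_bad": [1, 1, 1, 0, 1, 0, 0, 0],
})

def woe_table(data, feature, target):
    """Hypothetical helper: observations, good/bad shares and WoE per category."""
    grouped = data.groupby(feature)[target].agg(n_obs="count", n_good="sum")
    grouped["n_bad"] = grouped["n_obs"] - grouped["n_good"]
    # Share of all good (bad) borrowers that falls into each category.
    grouped["prop_n_good"] = grouped["n_good"] / grouped["n_good"].sum()
    grouped["prop_n_bad"] = grouped["n_bad"] / grouped["n_bad"].sum()
    # WoE = ln(share of goods / share of bads).
    grouped["WoE"] = np.log(grouped["prop_n_good"] / grouped["prop_n_bad"])
    return grouped.sort_values("WoE")

table = woe_table(df, "purpose", "good_bad")
print(table[["n_obs", "WoE"]])
```

Categories that end up next to each other in this sorted table and have few observations would be the candidates for merging, as far as I understand it.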
But in the homework problem, when we preprocess the discrete variable ‘purpose’, shouldn’t we group its values in the same way, i.e. into the following categories:
# small_business, educational, moving, house
# renewable_energy, medical, wedding, vacation
But the solution is given as:
# We combine ‘educational’, ‘small_business’, ‘wedding’, ‘renewable_energy’, ‘moving’, ‘house’ in one category: ‘educ__sm_b__wedd__ren_en__mov__house’.
# We combine ‘other’, ‘medical’, ‘vacation’ in one category: ‘oth__med__vacation’.
# We combine ‘major_purchase’, ‘car’, ‘home_improvement’ in one category: ‘major_purch__car__home_impr’.
# We leave ‘debt_consolidation’ in a separate category.
# We leave ‘credit_card’ in a separate category.
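In code, I take that grouping to amount to something like the sketch below for one of the combined categories. This is toy data rather than the course dataset, and the `purpose:` column prefix is an assumed naming convention:

```python
import pandas as pd

# Toy data (NOT the course dataset).
df = pd.DataFrame({"purpose": ["other", "medical", "vacation", "car", "credit_card"]})

# One dummy column per original category; the 'purpose:' prefix is an assumption.
dummies = pd.get_dummies(df["purpose"], prefix="purpose", prefix_sep=":").astype(int)

# A borrower falls into the combined category if they are in any of its members,
# so the combined dummy is the row-wise sum of the member dummies.
members = ["purpose:other", "purpose:medical", "purpose:vacation"]
dummies["purpose:oth__med__vacation"] = dummies[members].sum(axis=1)

print(dummies["purpose:oth__med__vacation"].tolist())  # [1, 1, 1, 0, 0]
```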
Could you please explain on what basis the above grouping is done?
Thank you for the query, and apologies for the late response! Are you still experiencing difficulties with the credit risk modeling lecture?
Thank you in advance!