Different dataset for credit risk modelling course
The dataset I downloaded from the Dropbox links mentioned in the course is different from the one shown in the videos. For example, for earliest_cr_line the values in the video look like 'Nov-01', while the value in my dataset is 'Nov-20'. What can be done to avoid this?
2 answers (0 marked as helpful)
Hi Kunjan,
Please refer to the credit risk modelling files available here: https://www.dropbox.com/sh/7oslws1xhsm1zbf/AABkdWDKqpdcGmY1NbXAnkrBa?dl=0
Best,
The 365 Team
It looks like some of the data was entered with a single-digit year, which causes programs to interpret it as something else. For example, Nov-01 becomes Nov-1, which a program may then read as 11/1 of the current year.
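Use loan_data['earliest_cr_line'] = loan_data['earliest_cr_line'].replace({r"^(\d-.*)$": r"0\g<1>", r"^(.*)-(\d)$": r"\g<1>-0\g<2>"}, regex=True) to restore the missing zero, so that the example above becomes Nov-01 again. The first pattern pads a single digit before the hyphen (1-Nov becomes 01-Nov); the second pads a single digit after it (Nov-1 becomes Nov-01). Assigning the result back to the column is more reliable than inplace=True on a selected column, which may not modify the original DataFrame in recent pandas versions.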
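To illustrate the zero-padding idea, here is a minimal sketch on a small made-up sample rather than the real course file. The sample values, the helper name pad_single_digit, and the '%b-%y' date format are assumptions for illustration only; check them against your own copy of the data before relying on this.

```python
import pandas as pd

# Hypothetical sample standing in for loan_data['earliest_cr_line'];
# the real column comes from the course CSV.
loan_data = pd.DataFrame(
    {"earliest_cr_line": pd.Series(["Nov-1", "Nov-01", "Dec-99", "Jan-5"], dtype=object)}
)


def pad_single_digit(series):
    """Add a leading zero to a lone digit on either side of the hyphen."""
    return series.replace(
        {
            r"^(\d-.*)$": r"0\g<1>",        # 1-Nov -> 01-Nov
            r"^(.*)-(\d)$": r"\g<1>-0\g<2>",  # Nov-1 -> Nov-01
        },
        regex=True,
    )


# Assign the result back instead of using inplace=True on a column selection.
loan_data["earliest_cr_line"] = pad_single_digit(loan_data["earliest_cr_line"])
fixed = loan_data["earliest_cr_line"].tolist()
print(fixed)

# Assumed format: abbreviated month name, hyphen, two-digit year.
parsed = pd.to_datetime(loan_data["earliest_cr_line"], format="%b-%y")
print(parsed.dt.year.tolist())
```

Values that already have two digits (such as Nov-01 or Dec-99) are left untouched, so running the helper more than once should be harmless. Note that with a two-digit year, Python maps 69-99 to the 1900s and 00-68 to the 2000s, so very old credit lines may still need a manual check.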