Credit Risk Modeling in Python Section PD Model 6.2

Question

Hello, when importing the variables to build the PD model, I get the following error, which points to line 127, ending in 'mths_since_last_record:>86']]:

KeyError: 'Passing list-likes to .loc or [] with any missing labels is no longer supported, see https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#deprecate-loc-reindex-listlike'

Answer 1

Hi Edu, did you use the code provided in the lecture? If you made any changes, could you please share them with us? Best,
The 365 Team

Answer 2

I had the same problem, even when I copied and pasted the lecture code. It turns out that using .loc[] with one or more missing labels was deprecated starting in pandas 0.21.0 and now raises this KeyError. The documentation (https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#indexing-deprecate-loc-reindex-listlike) suggests using .reindex() instead. This worked for me:
inputs_train_with_ref_cat = loan_data_inputs_train.reindex([ list of column names ], axis=1)
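
A minimal sketch of the difference between the two calls, assuming pandas 1.0 or later. The toy DataFrame and the `wanted` list are made up for illustration and only stand in for loan_data_inputs_train and the lecture's column list:

```python
import pandas as pd

# Toy frame standing in for loan_data_inputs_train (hypothetical data).
df = pd.DataFrame({"grade:A": [1, 0], "grade:B": [0, 1]})

# The second label does not exist in df.
wanted = ["grade:A", "mths_since_last_record:>86"]

# .loc with a list containing a missing label raises KeyError.
try:
    df.loc[:, wanted]
    loc_raised = False
except KeyError:
    loc_raised = True

# .reindex does not raise; it creates the absent column and fills it with NaN.
reindexed = df.reindex(wanted, axis=1)
nan_counts = reindexed.isnull().sum()
print(loc_raised)
print(nan_counts)
```

Note that .reindex() only silences the error. A label that was missing before is still missing afterwards; it just shows up as a column of NaN.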

Answer 3

Hello all, I replaced .loc with .reindex, and after running
inputs_train = inputs_train_with_ref_cat.drop(ref_categories, axis = 1)
inputs_train.head()
I get many NaN values:

grade:A                              0
grade:B                              0
grade:C                              0
grade:D                              0
grade:E                              0
...  
mths_since_last_record:3-20     373028
mths_since_last_record:21-31    373028
mths_since_last_record:32-80    373028
mths_since_last_record:81-86    373028
mths_since_last_record:>=86     373028
Length: 104, dtype: int64

It seems that .reindex doesn't really work.

If I open loan_data_inputs_train.csv and all the other files, I do not find any NaN values. All the data look normal.

Any ideas?

Best regards,
Volkmar Meiller
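
One possible reading of the output above: a column that is NaN in every row after .reindex() was probably not in the DataFrame to begin with, so the CSV itself can be clean. A small sketch for listing which requested labels are absent, assuming a toy frame and a made-up `wanted` list in place of the real ones:

```python
import pandas as pd

# Toy frame standing in for loan_data_inputs_train (hypothetical data).
df = pd.DataFrame({"grade:A": [1, 0], "grade:B": [0, 1]})

# Column list as it might be passed to .reindex(); the last label is absent.
wanted = ["grade:A", "grade:B", "mths_since_last_record:3-20"]

# Labels requested but not present in the frame.
missing = pd.Index(wanted).difference(df.columns)
print(list(missing))
```

If the list is not empty, the cells that create those dummy variables have most likely not been run on this DataFrame, or the labels are spelled differently from the column names.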

Answer 4

Yes, I agree with Volkmar about retaining the indexes.
The best approach is to use a list comprehension instead of a plain list of column names. My solution can be found at the link below:
https://stackoverflow.com/a/65146867/14605502
Hope it helps!
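
The linked answer is not reproduced here. As a rough sketch of how a list comprehension could be used for this, assuming the goal is to select only the labels that actually exist (the toy frame and `wanted` list are hypothetical):

```python
import pandas as pd

# Toy frame standing in for loan_data_inputs_train (hypothetical data).
df = pd.DataFrame({"grade:A": [1, 0], "grade:B": [0, 1]})
wanted = ["grade:A", "grade:B", "mths_since_last_record:>86"]

# Keep only the requested labels that are present, preserving their order.
present = [col for col in wanted if col in df.columns]

# .loc no longer raises, because every label in `present` exists.
selected = df.loc[:, present]
print(list(selected.columns))
```

This avoids both the KeyError and the NaN columns, but it silently drops the absent labels, so any dummy variables that were never created will still be missing from the model inputs.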

Answer 5

Ran into the same problem, but it turned out I hadn't been paying attention:

The code in section 4_16 has to be run twice, commenting out different parts each time.
First:
df_inputs_prepr = loan_data_inputs_train
df_targets_prepr = loan_data_targets_train
#df_inputs_prepr = loan_data_inputs_test
#df_targets_prepr = loan_data_targets_test
Second:
#df_inputs_prepr = loan_data_inputs_train
#df_targets_prepr = loan_data_targets_train
df_inputs_prepr = loan_data_inputs_test
df_targets_prepr = loan_data_targets_test
The same logic applies at the end of the notebook.

Then there is no problem with the iloc method in 5_2.
But I guess you could also do all the variable creation first, and do the train-test split right before fitting the model.
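
One way to avoid the comment-and-rerun step is to wrap the preprocessing in a function and call it once per dataset. This is only a sketch: `add_dummies` is a hypothetical stand-in for the notebook's preprocessing cells, and the two small frames are made-up data.

```python
import pandas as pd

def add_dummies(df_inputs_prepr):
    """Hypothetical stand-in for the notebook's preprocessing cells."""
    # Build dummy columns named like 'grade:A', 'grade:B'.
    dummies = pd.get_dummies(df_inputs_prepr["grade"], prefix="grade", prefix_sep=":")
    return pd.concat([df_inputs_prepr, dummies], axis=1)

# Made-up data in place of the real train and test inputs.
loan_data_inputs_train = pd.DataFrame({"grade": ["A", "B"]})
loan_data_inputs_test = pd.DataFrame({"grade": ["B", "A"]})

# The same steps are applied to both sets, so neither is left without its dummies.
loan_data_inputs_train = add_dummies(loan_data_inputs_train)
loan_data_inputs_test = add_dummies(loan_data_inputs_test)
print(list(loan_data_inputs_train.columns))
```

With both frames processed by the same function, the later column selection sees the same dummy columns in the train and test sets, provided every category occurs in both.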
