Credit Risk Modeling PD Model Estimation

Question

When estimating the PD model with logistic regression, I get a ValueError when running the lecture code: "

ValueError                                Traceback (most recent call last)
<ipython-input-32-966f89d3c717> in <module>
----> 1 reg.fit(inputs_train, loan_data_targets_train)
      2 # Estimates the coefficients of the object from the 'LogisticRegression' class
      3 # with inputs (independent variables) contained in the first dataframe
      4 # and targets (dependent variables) contained in the second dataframe.

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/linear_model/_logistic.py in fit(self, X, y, sample_weight)
   1525 
   1526         X, y = check_X_y(X, y, accept_sparse='csr', dtype=_dtype, order="C",
-> 1527                          accept_large_sparse=solver != 'liblinear')
   1528         check_classification_targets(y)
   1529         self.classes_ = np.unique(y)

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, warn_on_dtype, estimator)
    753                     ensure_min_features=ensure_min_features,
    754                     warn_on_dtype=warn_on_dtype,
--> 755                     estimator=estimator)
    756     if multi_output:
    757         y = check_array(y, 'csr', force_all_finite=True, ensure_2d=False,

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
    576         if force_all_finite:
    577             _assert_all_finite(array,
--> 578                                allow_nan=force_all_finite == 'allow-nan')
    579 
    580     if ensure_min_samples > 0:

~/opt/anaconda3/lib/python3.7/site-packages/sklearn/utils/validation.py in _assert_all_finite(X, allow_nan, msg_dtype)
     58                     msg_err.format
     59                     (type_err,
---> 60                      msg_dtype if msg_dtype is not None else X.dtype)
     61             )
     62     # for object dtype data, we only check for NaNs (GH-13254)

ValueError: Input contains NaN, infinity or a value too large for dtype('float64')"
It works fine in the video lecture, but the same code won't run in my Jupyter notebook. Can you help me, please?

Answer 1

Hi Mariam, thanks for reaching out, and sorry for the late response. Could you please share a link to the lecture video where you're running into this? Thanks so much in advance! Best, 365 Eli
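
The last line of the traceback says that the data passed to `reg.fit` contains NaN, infinity, or a value too large for float64, so a reasonable first step is to find which columns of `inputs_train` hold such values. Below is a minimal sketch of that check, assuming `inputs_train` is a pandas DataFrame. The helper name `report_non_finite` and the toy data are made up for illustration and are not part of the lecture code.

```python
import numpy as np
import pandas as pd


def report_non_finite(df):
    """Count NaN and +/-inf values per column; return only the columns that have any."""
    # Infinity only makes sense for numeric columns, so restrict that check to them.
    numeric = df.select_dtypes(include=[np.number])
    nan_counts = df.isna().sum()
    inf_counts = pd.Series(
        np.isinf(numeric.to_numpy(dtype=float)).sum(axis=0),
        index=numeric.columns,
    ).reindex(df.columns, fill_value=0)
    report = pd.DataFrame({"n_nan": nan_counts, "n_inf": inf_counts})
    return report[(report["n_nan"] > 0) | (report["n_inf"] > 0)]


# Toy stand-in for inputs_train (hypothetical column names and values).
toy_inputs = pd.DataFrame({
    "grade:A": [1, 0, 0, 1],
    "int_rate": [0.12, np.nan, 0.08, 0.10],
    "dti": [10.0, np.inf, 5.0, 7.5],
})
problems = report_non_finite(toy_inputs)
print(problems)
```

On the real data this would be called as `report_non_finite(inputs_train)`. If it lists any columns, those are the ones to inspect. Whether to drop or fill the affected rows depends on the preprocessing steps in the course, which this thread does not show.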

Answer 2

Hi,
I get the same error when fitting the logistic regression model. It is under PD Model Estimation.
