Last answered:

04 Jan 2023

Posted on:

05 Dec 2022


Key error with mean imputation


I am running the code from the video in a Jupyter notebook on my local machine, and I get an error when I run the mean imputation cell (3:00 in the video). Here is the error message:

KeyError                                  Traceback (most recent call last)
Input In [15], in <cell line: 6>()
      1 # Mean. One thing to note is we want to impute the testing set with the mean of the 
      2 # training set. So ideally if we're using the mean of the entire dataset and using that
      3 # to impute our training set, we are actually going to have leaking data. 
      5 x_train_m.loc[:,"age"] = x_train_m["age"].fillna(np.mean(x_train_m["age"]))
----> 6 x_test_m.loc[:,"age"] = x_test_m["age"].fillna(np.mean(x_train_m["age"]))    # Training set mean.
      8 x_train_m.loc[:,"days_on_platform"] = x_train_m["days_on_platform"].fillna(np.mean(x_train_m["days_on_platform"]))
      9 x_test_m.loc[:,"days_on_platform"] = x_test_m["days_on_platform"].fillna(np.mean(x_train_m["days_on_platform"]))

File ~/.virtualenvs/ds/lib/python3.8/site-packages/pandas/core/, in Series.__getitem__(self, key)
    955     return self._values[key]
    957 elif key_is_scalar:
--> 958     return self._get_value(key)
    960 if is_hashable(key):
    961     # Otherwise index.get_value will raise InvalidIndexError
    962     try:
    963         # For labels that don't resolve as scalars like tuples and frozensets

File ~/.virtualenvs/ds/lib/python3.8/site-packages/pandas/core/, in Series._get_value(self, label, takeable)
   1066     return self._values[label]
   1068 # Similar to Index.get_value, but we do not fall back to positional
-> 1069 loc = self.index.get_loc(label)
   1070 return self.index._get_values_for_loc(self, loc, label)

File ~/.virtualenvs/ds/lib/python3.8/site-packages/pandas/core/indexes/, in RangeIndex.get_loc(self, key, method, tolerance)
    387             raise KeyError(key) from err
    388     self._check_indexing_error(key)
--> 389     raise KeyError(key)
    390 return super().get_loc(key, method=method, tolerance=tolerance)

KeyError: 'age'

I have tried to figure it out but I don't see any difference between my code and the code in the video. Any feedback? Is there maybe anything I need to install?

Thank you in advance and regards,

3 answers (0 marked as helpful)
Posted on:

06 Dec 2022


I replaced my code with the version found in the resources and now it works. I never figured out why the original failed, but the problem is solved.

Thank you.
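
In case it helps anyone who hits the same traceback: the failing call goes through `Series.__getitem__` and `RangeIndex.get_loc`, which suggests that `x_test_m` was a pandas Series with a default integer index rather than a DataFrame with an `age` column. What caused that in the original notebook is not known. The sketch below only reproduces the symptom and shows the intended imputation; the toy data, the split, and the helper `describe` are hypothetical and are not taken from the course code.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: column names follow the traceback, values are made up.
df = pd.DataFrame({
    "age": [25.0, np.nan, 40.0, np.nan],
    "days_on_platform": [10.0, 20.0, np.nan, 5.0],
})

# Hypothetical split into two DataFrames (the notebook's real split is not shown).
x_train_m = df.iloc[:3].copy()
x_test_m = df.iloc[3:].copy()


def describe(obj):
    """Return the type name and the labels that obj[...] looks a key up in."""
    if isinstance(obj, pd.DataFrame):
        return type(obj).__name__, list(obj.columns)
    return type(obj).__name__, list(obj.index)


# A Series with a default RangeIndex reproduces the symptom: "age" is searched
# for among the row labels 0..n-1, not among column names.
not_a_frame = pd.Series([1, 0, 1])
try:
    not_a_frame["age"]
except KeyError as err:
    reproduced = err.args[0]

# With DataFrames on both sides, impute both sets with the training-set mean
# so that no information from the test set leaks into the fill value.
train_age_mean = x_train_m["age"].mean()  # NaN values are skipped by default
x_train_m["age"] = x_train_m["age"].fillna(train_age_mean)
x_test_m["age"] = x_test_m["age"].fillna(train_age_mean)
```

Running `describe(x_test_m)` (or simply `type(x_test_m)`) just before the failing line shows whether the variable is still a DataFrame that contains an `age` column.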

Posted on:

28 Dec 2022


Glad you solved it, Isabel! It may be an error in our video walkthrough.

Posted on:

04 Jan 2023


Thank you Jeff.
