Resolved: KeyError: "['Pets', 'Month_value'] not in index"
While running this code:
model.load_and_clean_data('Absenteeism_new_data.csv')
I get this error:
KeyError: "['Pets', 'Month_value'] not in index"
I tried adding the argument delim_whitespace=True to the pd.read_csv call in the module file, and the error changed from the indexer error to 'ID' not found. I am stuck here, so please help me resolve this as soon as possible.
pd.read_csv("wspace.csv", header=None, delim_whitespace=True)
Here is the full error that occurs for this code:
model.load_and_clean_data('Absenteeism_new_data.csv')
KeyError Traceback (most recent call last)
<ipython-input-4-31f66a937b22> in <module>
----> 1 model.load_and_clean_data('Absenteeism_new_data.csv')
~\ML projects\absenteeism_module.py in load_and_clean_data(self, data_file)
128
129 # we need this line so we can use it in the next functions
--> 130 self.data = self.scaler.transform(df)
131
132 # a function which outputs the probability of a data point to be 1
~\ML projects\absenteeism_module.py in transform(self, X, y, copy)
29 def transform(self, X, y=None, copy=None):
30 init_col_order = X.columns
---> 31 X_scaled = pd.DataFrame(self.scaler.transform(X[self.columns]), columns=self.columns)
32 X_not_scaled = X.loc[:,~X.columns.isin(self.columns)]
33 return pd.concat([X_not_scaled, X_scaled], axis=1)[init_col_order]
~\anaconda3\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
2910 if is_iterator(key):
2911 key = list(key)
-> 2912 indexer = self.loc._get_listlike_indexer(key, axis=1, raise_missing=True)[1]
2913
2914 # take() does not accept boolean indexers
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _get_listlike_indexer(self, key, axis, raise_missing)
1252 keyarr, indexer, new_indexer = ax._reindex_non_unique(keyarr)
1253
-> 1254 self._validate_read_indexer(keyarr, indexer, axis, raise_missing=raise_missing)
1255 return keyarr, indexer
1256
~\anaconda3\lib\site-packages\pandas\core\indexing.py in _validate_read_indexer(self, key, indexer, axis, raise_missing)
1302 if raise_missing:
1303 not_found = list(set(key) - set(ax))
-> 1304 raise KeyError(f"{not_found} not in index")
1305
1306 # we skip the warning on Categorical
KeyError: "['Pets', 'Month_value'] not in index"
Resolved: After some clumsy trial and error, I found that the error came from mismatched spellings of two variables, 'Pets' and 'Month_values', between the model file and the module file. Correcting the spelling in both files resolved the issue.
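For anyone hitting the same KeyError, a quicker way to spot this kind of spelling mismatch is to compare the names the scaler expects with the names the DataFrame actually has. The snippet below is only a minimal sketch: the helper report_missing_columns and the example column lists are hypothetical and are not part of the course module.

```python
from difflib import get_close_matches


def report_missing_columns(expected, actual):
    """Map each expected name that is absent from `actual` to its closest
    actual name, or to None when nothing similar is found."""
    actual = list(actual)
    report = {}
    for name in expected:
        if name not in actual:
            close = get_close_matches(name, actual, n=1, cutoff=0.6)
            report[name] = close[0] if close else None
    return report


# Hypothetical example: names the scaler was fitted on vs. names in the data
expected = ["Month_value", "Pets", "Age"]
actual = ["Month Value", "Pet", "Age"]

for missing, suggestion in report_missing_columns(expected, actual).items():
    print(f"{missing!r} not found; closest existing column: {suggestion!r}")
```

In the module this could be called as report_missing_columns(self.columns, X.columns) just before the X[self.columns] line shown in the traceback, assuming those attributes hold the column names as the traceback suggests.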
Hi Sumeera!
Thanks for reaching out.
Please accept my apologies for the delayed response.
Also, thank you very much for pointing this out! It seems the error indeed stems from using 'Pets' in plural in the module.
This is something we need to correct. Thank you.
Best,
Martin
model.load_and_clean_data('Absenteeism_new_data.csv')
After running the code above, I got the error below:
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[3], line 1
----> 1 model.load_and_clean_data('Absenteeism_new_data.csv')
File ~/Desktop/ABSINTEEISM_DETECTION_MODEL/Absenteeism_Module.py:128, in absenteeism_model.load_and_clean_data(self, data_file)
125 self.preprocessed_data = df.copy()
127 # we need this line so we can use it in the next functions
--> 128 self.data = self.scaler.transform(df)
File ~/miniconda3/envs/myenv/lib/python3.10/site-packages/sklearn/utils/_set_output.py:142, in _wrap_method_output.<locals>.wrapped(self, X, *args, **kwargs)
140 @wraps(f)
141 def wrapped(self, X, *args, **kwargs):
--> 142 data_to_wrap = f(self, X, *args, **kwargs)
143 if isinstance(data_to_wrap, tuple):
144 # only wrap the first output for cross decomposition
145 return (
146 _wrap_data_with_container(method, data_to_wrap[0], X, self),
147 *data_to_wrap[1:],
148 )
File ~/Desktop/ABSINTEEISM_DETECTION_MODEL/Absenteeism_Module.py:31, in CustomScaler.transform(self, X, y, copy)
29 def transform(self, X, y=None, copy=None):
30 init_col_order = X.columns
---> 31 X_scaled = pd.DataFrame(self.scaler.transform(X[self.columns]), columns=self.columns)
...
6173 raise KeyError(f"None of [{key}] are in the [{axis_name}]")
6175 not_found = list(ensure_index(key)[missing_mask.nonzero()[0]].unique())
-> 6176 raise KeyError(f"{not_found} not in index")
KeyError: "['Pets'] not in index"
I changed the Pets column in the CSV file to Pet, but the error still appears.
Please help, as I am stuck on this error.
Hi Victory!
Thanks for reaching out.
Can you please change 'Pet' to 'Pets' in the absenteeism_module.py file and retry? Thank you.
Hope this helps.
Kind regards,
Martin
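If editing the module file is not convenient, another option may be to rename the column on the DataFrame before it reaches the scaler. This is only a sketch under assumptions: the small DataFrame and the scaler_columns list below are hypothetical stand-ins for what load_and_clean_data builds and for the columns the saved scaler was fitted on.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame built inside load_and_clean_data
df = pd.DataFrame({"Pet": [0, 1, 2], "Age": [33, 50, 38]})

# Hypothetical list of the column names the fitted scaler expects
scaler_columns = ["Pets", "Age"]

# Align the DataFrame with the spelling the scaler expects
df = df.rename(columns={"Pet": "Pets"})

# Fail early with a readable message instead of a KeyError deep inside pandas
missing = [c for c in scaler_columns if c not in df.columns]
if missing:
    raise ValueError(f"Columns still missing before scaling: {missing}")

subset = df[scaler_columns]  # this selection no longer raises KeyError
print(list(subset.columns))
```

Either way, the goal is the same: the name used when the scaler was fitted and the name present in the DataFrame at transform time have to match exactly.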