ValueError: X has 11 features, but StandardScaler is expecting 14 features as input.
When calling model.load_and_clean_data, I get the following error:
----------------------------------
ValueError Traceback (most recent call last)
<ipython-input-6-7ce9ffbee42e> in <module>
----> 1 model.load_and_clean_data("C:/Users/pc/Desktop/PAYTHON SQL/Absenteeism_new_data.csv")
~\Desktop\PAYTHON SQL\Absenteeism_module.py in load_and_clean_data(self, data_file)
127
128 # we need this line so we can use it in the next functions
--> 129 self.data = self.scaler.transform(df)
130
131 # a function which outputs the probability of a data point to be 1
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in transform(self, X, copy)
881
882 copy = copy if copy is not None else self.copy
--> 883 X = self._validate_data(X, reset=False,
884 accept_sparse='csr', copy=copy,
885 estimator=self, dtype=FLOAT_DTYPES,
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
435
436 if check_params.get('ensure_2d', True):
--> 437 self._check_n_features(X, reset=reset)
438
439 return out
C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _check_n_features(self, X, reset)
363
364 if n_features != self.n_features_in_:
--> 365 raise ValueError(
366 f"X has {n_features} features, but {self.__class__.__name__} "
367 f"is expecting {self.n_features_in_} features as input.")
ValueError: X has 11 features, but StandardScaler is expecting 14 features as input.
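One way to narrow this down is to compare the number of columns the saved scaler was fitted on with the number of columns the preprocessing step produces. The sketch below is only an illustration: it uses a plain `StandardScaler` and random data as stand-ins for the course files, and `check_feature_count` is a hypothetical helper, not part of the course module. `n_features_in_` is the attribute the traceback above checks.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler


def check_feature_count(scaler, X):
    """Return (expected, actual) column counts for a fitted scaler and new data."""
    expected = scaler.n_features_in_  # set when the scaler was fitted
    actual = np.asarray(X).shape[1]
    return expected, actual


# Stand-in data (assumption): the scaler is fitted on 14 columns,
# while the new data has only 11.
rng = np.random.default_rng(0)
scaler = StandardScaler().fit(rng.normal(size=(20, 14)))
new_data = rng.normal(size=(5, 11))

expected, actual = check_feature_count(scaler, new_data)
if expected != actual:
    print(f"Scaler was fitted on {expected} columns, but the data has {actual}.")
```

If the two numbers differ, the scaler file and the preprocessing code were most likely not produced together.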
Solved:
Don't work with the downloaded files (the scaler file and the model file). Work only with the 'scaler' and 'model' files that you generate on your own PC, because files produced with different library versions may cause errors.
Hi Azzaz!
Thanks for reaching out and sharing the solution to this problem.
Indeed, in this exercise, we use several Python libraries and their different versions may sometimes lead to incompatibility.
Thank you once again and please don't hesitate to post another question should you encounter other difficulties.
Hope this helps.
Best,
Martin
So, please, what's the solution to this issue?
Hi Stephen!
Thanks for reaching out!
Basically, working with incompatible versions of Python and its libraries leads to such errors. To avoid them, please use the same versions and follow the same steps as in the solution shown in the videos.
In other words, should you encounter this type of error message, it would be better, as Azzaz suggests, to create the 'model' and 'scaler' files as shown in the videos rather than download the versions we provide as resources in the course. The reason is that the ones we provide may work for those who use the same versions of Python and its libraries as the ones we used when recording the videos. If you use different versions, there's no shortcut: follow the videos and create the two files yourself.
Hope this helps.
Kind regards,
Martin
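To compare your setup with the one used in the videos, it can help to print the installed versions first. This is a minimal sketch using only the standard library; `installed_version` is a hypothetical helper name, and the package list is an assumption about which libraries matter here.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


print("Python:", sys.version.split()[0])
for name in ("scikit-learn", "pandas", "numpy"):
    print(f"{name}:", installed_version(name) or "not installed")
```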
But how do you create a model and a scaler file? Can you be more specific?
Hi Alex!
Thanks for reaching out.
The answer lies in the explanations provided in the following lecture from the previous section of the course:
https://learn.365datascience.com/courses/sql-tableau-python/saving-thelogistic-regression-model/
It reflects the work done in the entire section 4 from the course.
We do advise you to follow the order in which the lectures in a course have been presented.
Hope this helps but please feel free to get back to us should you need further assistance. Thank you.
Best,
Martin
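As a rough illustration of what creating the two files involves, the sketch below fits a scaler and a logistic regression on your own machine and saves both with `pickle`. It assumes the files are pickled objects named 'model' and 'scaler', as in the thread. The data is synthetic, and a plain `StandardScaler` stands in for the course's custom scaler, so treat it as an outline rather than the course solution.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the preprocessed course data (14 input columns).
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 14))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Fit the scaler and the model with the libraries installed on this machine.
scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

# Save both objects (a temporary folder is used here for illustration).
out_dir = Path(tempfile.mkdtemp())
with open(out_dir / "model", "wb") as f:
    pickle.dump(model, f)
with open(out_dir / "scaler", "wb") as f:
    pickle.dump(scaler, f)

# Reload them the way a separate module would, then predict on new rows.
with open(out_dir / "model", "rb") as f:
    loaded_model = pickle.load(f)
with open(out_dir / "scaler", "rb") as f:
    loaded_scaler = pickle.load(f)

new_rows = rng.normal(size=(3, 14))
predictions = loaded_model.predict(loaded_scaler.transform(new_rows))
print(predictions)
```

Because the files are written and read with the same installed library versions, the loading step does not depend on the versions used to record the videos.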
Hi, I am coming here from the Business Intelligence Analyst Course 2022. I am not sure if I missed something, but I am sure the previous section only covered data preprocessing with Python and, even earlier, introductions to various Python built-in libraries. Nothing at all was mentioned about creating my own model and scaler files, and now I am unable to proceed with the course.
Any help would be appreciated, thanks.
Hi Bryan!
Thanks for reaching out.
Have you completed Section 4 from the course? It is about creating an ML model and then saving it.
https://learn.365datascience.com/courses/sql-tableau-python/exploring-the-problem-from-a-machine-learning-point-of-view/
Please complete this section and feel free to get back to us should you need further assistance. Thank you.
Kind regards,
Martin
@martin Ganchev, this doesn't work for me either. The code was already broken because of ModuleNotFoundError: No module named 'sklearn.linear_model.logistic'; this module is no longer available in scikit-learn version 1.2.1. I tried upgrading and downgrading it, and neither worked. I am working in Python 3.9.
model = absenteeism_model('model', 'scaler')
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[22], line 1
----> 1 absenteeism_scaler = CustomScaler(columns_to_scale)
Cell In[19], line 7, in CustomScaler.__init__(self, columns, copy, with_mean, with_std)
6 def __init__(self,columns,copy=True,with_mean=True,with_std=True):
----> 7 self.scaler = StandardScaler(copy,with_mean,with_std)
8 self.columns = columns
9 self.mean_ = None
TypeError: StandardScaler.__init__() takes 1 positional argument but 4 were given
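This TypeError appears because recent scikit-learn releases accept the `StandardScaler` options only as keyword arguments. A possible fix, sketched under the assumption that the rest of `CustomScaler` looks roughly like the class below (only `__init__` is visible in the traceback), is to pass the options by name. The column names in the usage lines are made-up stand-ins.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.preprocessing import StandardScaler


class CustomScaler(TransformerMixin, BaseEstimator):
    """Standardize only the listed columns and leave the others untouched."""

    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        self.columns = columns
        self.copy = copy
        self.with_mean = with_mean
        self.with_std = with_std
        # Keyword arguments: positional ones raise the TypeError shown above.
        self.scaler = StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std)

    def fit(self, X, y=None):
        self.scaler.fit(X[self.columns])
        return self

    def transform(self, X, y=None):
        scaled = pd.DataFrame(
            self.scaler.transform(X[self.columns]),
            columns=self.columns,
            index=X.index,
        )
        not_scaled = X.loc[:, ~X.columns.isin(self.columns)]
        # Restore the original column order after scaling.
        return pd.concat([not_scaled, scaled], axis=1)[X.columns]


# Hypothetical two-column frame: only "Age" is scaled.
df = pd.DataFrame({"Age": [30, 40, 50], "Reason_1": [0, 1, 0]})
absenteeism_scaler = CustomScaler(columns=["Age"]).fit(df)
scaled_df = absenteeism_scaler.transform(df)
print(scaled_df)
```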
I solved all the problems and coded it twice on another device, but now I get another error:
model.load_and_clean_data('Absenteeism_new_data.csv')
KeyError: "['Day of the Week'] not in index"
Hi Rosaline!
Thanks for reaching out!
Because of the length of the exercise, students sometimes run into this error message when they have run or rewritten a piece of code from the earlier section(s) several times. Therefore, please restart the kernel and re-run the entire course code, or retry with the files provided in Dropbox. In that case, you can verify whether what you had obtained before coincides completely with the provided code.
Hope this helps.
Best,
Martin
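Before re-running everything, a quick check of which columns are missing can show whether 'Day of the Week' is created at all by the preprocessing code. The sketch below is a hypothetical helper, not part of the course module, and the column lists are shortened stand-ins. With pandas, `df.columns` can be passed as `available`.

```python
def missing_columns(available, required):
    """Return the required column names that are absent, in their original order."""
    available = set(available)
    return [name for name in required if name not in available]


# Hypothetical column lists; with pandas, pass df.columns as `available`.
available = ["Month Value", "Transportation Expense", "Age"]
required = ["Month Value", "Day of the Week", "Transportation Expense", "Age"]

absent = missing_columns(available, required)
if absent:
    print("Columns missing before reordering:", absent)
```

An empty result means every required column exists, so the KeyError would have to come from somewhere else.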