ValueError: X has 11 features, but StandardScaler is expecting 14 features as input.

Question

When I call model.load_and_clean_data, I get the following error:

----------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-6-7ce9ffbee42e> in <module>
----> 1 model.load_and_clean_data("C:/Users/pc/Desktop/PAYTHON SQL/Absenteeism_new_data.csv")

~\Desktop\PAYTHON SQL\Absenteeism_module.py in load_and_clean_data(self, data_file)
    127 
    128             # we need this line so we can use it in the next functions
--> 129             self.data = self.scaler.transform(df)
    130 
    131         # a function which outputs the probability of a data point to be 1

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\preprocessing\_data.py in transform(self, X, copy)
    881 
    882         copy = copy if copy is not None else self.copy
--> 883         X = self._validate_data(X, reset=False,
    884                                 accept_sparse='csr', copy=copy,
    885                                 estimator=self, dtype=FLOAT_DTYPES,

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    435 
    436         if check_params.get('ensure_2d', True):
--> 437             self._check_n_features(X, reset=reset)
    438 
    439         return out

C:\ProgramData\Anaconda3\lib\site-packages\sklearn\base.py in _check_n_features(self, X, reset)
    363 
    364         if n_features != self.n_features_in_:
--> 365             raise ValueError(
    366                 f"X has {n_features} features, but {self.__class__.__name__} "
    367                 f"is expecting {self.n_features_in_} features as input.")

ValueError: X has 11 features, but StandardScaler is expecting 14 features as input.

Answer 1

Solved:
Don't work with the downloaded files (the scaler file and the model file). Use only the 'scaler' and 'model' files that you generate on your own PC, because files produced with different library versions may cause errors.
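
Before regenerating anything, it can help to confirm the mismatch directly. The snippet below is only a sketch under assumptions: the helper name `check_feature_count` and the toy columns are hypothetical, and the real course data is not reproduced. It compares the column count of the new data with `n_features_in_`, the attribute the traceback above shows scikit-learn checking.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


def check_feature_count(scaler, df):
    """Return (expected, actual) feature counts for a fitted scaler and a DataFrame."""
    expected = scaler.n_features_in_  # recorded when the scaler was fitted
    actual = df.shape[1]              # number of columns about to be transformed
    return expected, actual


# Toy data standing in for the course files (hypothetical column names).
train = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [4.0, 5.0, 6.0], "c": [7.0, 8.0, 9.0]})
new = train[["a", "b"]]  # one column fewer, analogous to the 11-vs-14 case

scaler = StandardScaler().fit(train)
expected, actual = check_feature_count(scaler, new)
if expected != actual:
    print(f"Scaler expects {expected} features, but the data has {actual}")
```

If the two numbers differ, the scaler file was fitted on a differently shaped table than the one your preprocessing produces, which is consistent with the advice to regenerate both files locally.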

Answer 2

Hi Azzaz!

Thanks for reaching out and sharing the solution of this problem.

Indeed, in this exercise, we use several Python libraries and their different versions may sometimes lead to incompatibility.
Thank you once again and please don't hesitate to post another question should you encounter other difficulties.

Hope this helps.
Best,
Martin

Answer 3

Stephen 0199, posted on 14 Feb 2022

So what is the solution to this issue, please?

Answer 4

Hi Stephen!

Thanks for reaching out!

Basically, using versions of Python and its libraries that are incompatible with each other leads to such errors. To avoid them, please use the same versions and follow the same steps as shown in the videos.
In other words, should you encounter this type of error message, it is better, as Azzaz suggests, to create the 'model' and 'scaler' files yourself as shown in the videos rather than download the versions we provide as course resources. The reason is that the files we provide may work for those who use the same versions of Python and its libraries as we used when recording the videos. If you use different versions, there is no shortcut: follow the videos and create the two files yourself.

Hope this helps.
Kind regards,
Martin

Answer 5

But how do you create a model and a scaler file? Can you be more specific?

Answer 6

Hi Alex!

Thanks for reaching out.

The answer lies in the explanations provided in the following lecture from the previous section of the course:
https://learn.365datascience.com/courses/sql-tableau-python/saving-thelogistic-regression-model/
It reflects the work done throughout Section 4 of the course.

We do advise you to follow the order in which the lectures in a course have been presented.

Hope this helps but please feel free to get back to us should you need further assistance. Thank you.
Best,
Martin
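
For readers who cannot open the lecture right away, the general idea of creating the two files can be sketched as follows. This is a minimal sketch and not the course's code: the toy data and the file names `model_demo` and `scaler_demo` are assumptions, and the course's own preprocessing and custom scaler are not reproduced. It fits a scaler and a logistic regression locally and pickles both, so the saved files match the installed scikit-learn version.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Toy inputs and targets standing in for the preprocessed course data.
X = np.arange(120, dtype=float).reshape(40, 3)
y = (np.arange(40) >= 20).astype(int)

# Fit the scaler and the model with the locally installed library versions.
scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), y)

# Save both objects; a temporary directory keeps the demo away from real files.
out_dir = tempfile.mkdtemp()
model_path = os.path.join(out_dir, "model_demo")
scaler_path = os.path.join(out_dir, "scaler_demo")
with open(model_path, "wb") as f:
    pickle.dump(model, f)
with open(scaler_path, "wb") as f:
    pickle.dump(scaler, f)

# Load them back the way a module would, then predict on a few rows.
with open(model_path, "rb") as f:
    loaded_model = pickle.load(f)
with open(scaler_path, "rb") as f:
    loaded_scaler = pickle.load(f)

predictions = loaded_model.predict(loaded_scaler.transform(X[:5]))
```

Because the files are written and read by the same library versions, the loaded scaler expects exactly the number of features it was fitted on in your own environment.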

Answer 7

Hi, I am coming here from the Business intelligence analyst course 2022. I am not sure if I missed something, but I am sure the previous section simply mentioned data preprocessing with Python and, even earlier, introductions to various Python default library stuff. There is absolutely nothing mentioned about creating my own model and scaler files, and now I am unable to proceed with the course.

Any help would be appreciated, thanks.

Answer 8

Hi Bryan!

Thanks for reaching out.

Have you completed Section 4 from the course? It is about creating an ML model and then saving it.
https://learn.365datascience.com/courses/sql-tableau-python/exploring-the-problem-from-a-machine-learning-point-of-view/

Please complete this section and feel free to get back to us should you need further assistance. Thank you.

Kind regards,
Martin

Answer 9

@martin Ganchev, this doesn't work for me either. The code was already broken because of ModuleNotFoundError: No module named 'sklearn.linear_model.logistic', which is no longer available in scikit-learn version 1.2.1. I tried upgrading it, which doesn't work, and downgrading doesn't work either. I work in Python 3.9.

model = absenteeism_model('model', 'scaler')

---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[22], line 1
----> 1 absenteeism_scaler = CustomScaler(columns_to_scale)

Cell In[19], line 7, in CustomScaler.__init__(self, columns, copy, with_mean, with_std)
6 def __init__(self,columns,copy=True,with_mean=True,with_std=True):
----> 7 self.scaler = StandardScaler(copy,with_mean,with_std)
8 self.columns = columns
9 self.mean_ = None

TypeError: StandardScaler.__init__() takes 1 positional argument but 4 were given
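
The TypeError above comes from passing `copy`, `with_mean` and `with_std` positionally; recent scikit-learn releases accept these `StandardScaler` parameters as keywords only. A minimal sketch of a fix follows. Only the `__init__` signature is taken from the traceback; the `fit` and `transform` bodies, the plain class without scikit-learn base classes, and the toy data are assumptions made for illustration, not the course's exact code.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler


class CustomScaler:
    """Standardize only the listed columns and leave the others untouched."""

    def __init__(self, columns, copy=True, with_mean=True, with_std=True):
        # Pass the options by keyword instead of by position.
        self.scaler = StandardScaler(copy=copy, with_mean=with_mean, with_std=with_std)
        self.columns = columns

    def fit(self, X, y=None):
        self.scaler.fit(X[self.columns])
        return self

    def transform(self, X):
        original_order = X.columns
        scaled = pd.DataFrame(
            self.scaler.transform(X[self.columns]),
            columns=self.columns,
            index=X.index,
        )
        not_scaled = X.loc[:, ~X.columns.isin(self.columns)]
        # Put the columns back in the order the input had.
        return pd.concat([not_scaled, scaled], axis=1)[original_order]


# Toy data (hypothetical column names).
df = pd.DataFrame({"age": [20.0, 30.0, 40.0], "flag": [0, 1, 0]})
columns_to_scale = ["age"]
absenteeism_scaler = CustomScaler(columns_to_scale)
scaled_df = absenteeism_scaler.fit(df).transform(df)
```

The ModuleNotFoundError mentioned in the same post typically appears when unpickling a file saved with an older scikit-learn, so refitting and saving the model locally is the likely way around it as well.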

Answer 10

I solved all the problems and coded it twice on another device, but now I get another error:

model.load_and_clean_data('Absenteeism_new_data.csv')

KeyError: "['Day of the Week'] not in index"

Answer 11

Hi Rosaline!

Thanks for reaching out!

Due to the length of the exercise, students sometimes run into this error message after running or rewriting a piece of code from the earlier section(s) several times. Therefore, please restart the kernel and re-run the entire course code, or retry with the files provided in Dropbox. In the latter case, you can verify whether what you had obtained before coincides completely with the provided code.

Hope this helps.
Best,
Martin
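
A KeyError of this kind means the DataFrame is being indexed with a column name it does not contain at that point, for instance because an earlier cell already dropped or renamed it. A small check can list what is missing before the indexing step runs. The sketch below uses a hypothetical helper `missing_columns` and toy column names; it is not part of the course module.

```python
import pandas as pd


def missing_columns(df, required):
    """Return the names in `required` that are absent from df, in order."""
    return [name for name in required if name not in df.columns]


# Toy frame that lacks one of the columns the reordering step asks for.
df = pd.DataFrame({"Month Value": [7, 8], "Age": [33, 50]})
required = ["Month Value", "Day of the Week", "Age"]

absent = missing_columns(df, required)
if absent:
    print(f"Missing columns: {absent}")
else:
    df = df[required]  # safe to reorder only when nothing is missing
```

If the list is not empty after a fresh kernel restart, the preprocessing code and the column list it uses are out of sync.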
