Your model doesn't appear to be balanced?
When executing:
y_test_pred = clf.predict(x_test_transf)
I get this error:
ValueError: X has 1646 features, but MultinomialNB is expecting 3810 features as input.
This appears to mean that MultinomialNB is expecting input with a different number of features from what it is being given. In other words, it's not balanced.
As much as I would like to try to balance the model myself, I think I would do more harm than good, so if you could either balance the model or tell me where I'm going wrong, it would be most appreciated.
Thanks.
Hey Timothy,
Thank you for your question!
This error might be caused by something done earlier in the code. Unfortunately, I cannot reproduce the error you are getting, so I cannot offer a specific solution. What I can suggest is to restart the notebook from Kernel -> Restart and run everything anew. Note that it is important to execute the cells in order.
Kind regards,
365 Hristina
So I did get your notebook to work finally. I'm going to have to go over what I did. Thank you so much for the confirmation that I was doing something wrong.
Hello, I think the problem is that you called vectoriser.fit_transform(x_test) on the x_test variable instead of just vectoriser.transform(x_test). Calling fit_transform refits the vectoriser on the test data, so x_test ends up with a different set of features (1646) from the one the classifier was trained on (3810). Using only transform means that x_test is transformed by the vectoriser that was fit on the x_train variable, so the feature counts match.
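To illustrate the point about fit_transform versus transform, here is a minimal sketch. It assumes the vectoriser is a CountVectorizer (the thread does not say which one the notebook uses), and the toy sentences and labels are made up purely for illustration. The variable names follow the ones used in this thread.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Toy data, invented for this example; the real notebook data will differ.
x_train = ["free prize now", "meeting at noon",
           "win a free prize", "lunch at noon tomorrow"]
y_train = [1, 0, 1, 0]
x_test = ["free lunch", "prize meeting tomorrow"]

# Assumption: the course notebook's vectoriser behaves like CountVectorizer.
vectoriser = CountVectorizer()

# Learn the vocabulary from the training data only.
x_train_transf = vectoriser.fit_transform(x_train)

# Reuse that same vocabulary for the test data: transform, not fit_transform.
x_test_transf = vectoriser.transform(x_test)

clf = MultinomialNB()
clf.fit(x_train_transf, y_train)

# Works because both matrices have the same number of columns.
y_test_pred = clf.predict(x_test_transf)

print(x_train_transf.shape[1], x_test_transf.shape[1])
```

If the second call were vectoriser.fit_transform(x_test) instead, the vocabulary would be rebuilt from the test sentences alone, the column count would change, and clf.predict would raise the same kind of ValueError as in the original question.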
I am getting the error below and am not sure what I am doing wrong:
AttributeError Traceback (most recent call last)
Cell In[93], line 1
----> 1 ConfusionMatrixDisplay.from_predictions(
2 y_test, y_test_pred,
3 labels = clf.classes_,
4 cmap = 'magma');
AttributeError: type object 'ConfusionMatrixDisplay' has no attribute 'from_predictions'
Hey Erick,
Thank you for reaching out!
Please update your scikit-learn library to version 1.0 or higher; ConfusionMatrixDisplay.from_predictions was added in that release. This will solve the issue.
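One way to check which version is installed is sketched below. The helper name supports_from_predictions is hypothetical and only serves this example; it simply looks at the major version number, which should be enough for a 1.0 cutoff.

```python
from importlib.metadata import PackageNotFoundError, version


def supports_from_predictions(installed: str) -> bool:
    """Hypothetical helper: True if the version string is 1.0 or newer."""
    major = int(installed.split(".")[0])
    return major >= 1


try:
    installed = version("scikit-learn")
    if supports_from_predictions(installed):
        print(f"scikit-learn {installed}: from_predictions should be available")
    else:
        # Upgrade from a terminal or notebook cell, for example:
        #   pip install --upgrade scikit-learn
        print(f"scikit-learn {installed}: please upgrade to 1.0 or higher")
except PackageNotFoundError:
    print("scikit-learn does not appear to be installed in this environment")
```

After upgrading, restart the notebook kernel so that the new version is the one that actually gets imported.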
Kind regards,
365 Hristina