Super learner
This user is a Super Learner. To become a Super Learner, you need to reach Level 8.
Last answered:

29 Dec 2022

Posted on:

13 Dec 2022


Error in implementing the Isolation Forest Method

I tried using the Isolation Forest method on the popular IBM employee attrition dataset and keep getting the error below. Kindly help, please.

Code:
from sklearn.ensemble import IsolationForest

features = data.columns

X = data[features]
X_train = X[:1000]
X_test = X[1000:]

#Fit Model
clf = IsolationForest(n_estimators=50, max_samples=100)

#Get Scores
data['scores'] = clf.decision_function(X_train)
data['anomaly'] = clf.predict(X)

#Get Anomalies
outliers = data.loc[data['anomaly'] == -1]



ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp\ipykernel_13032\ in <module>
     19 #Get Scores
---> 20 data['scores'] = clf.decision_function(X_train)
     21 data['anomaly'] = clf.predict(X)

~\anaconda3\lib\site-packages\pandas\core\ in __setitem__(self, key, value)
   3653         else:
   3654             # set column
-> 3655             self._set_item(key, value)
   3657     def _setitem_slice(self, key: slice, value):

~\anaconda3\lib\site-packages\pandas\core\ in _set_item(self, key, value)
   3830         ensure homogeneity.
   3831         """
-> 3832         value = self._sanitize_column(value)
   3834         if (

~\anaconda3\lib\site-packages\pandas\core\ in _sanitize_column(self, value)
   4537         if is_list_like(value):
-> 4538             com.require_length_match(value, self.index)
   4539         return sanitize_array(value, self.index, copy=True, allow_2d=True)

~\anaconda3\lib\site-packages\pandas\core\ in require_length_match(data, index)
    555     """
    556     if len(data) != len(index):
--> 557         raise ValueError(
    558             "Length of values "
    559             f"({len(data)}) "

ValueError: Length of values (1000) does not match length of index (1470)
2 answers (0 marked as helpful)
Posted on:

28 Dec 2022


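Hey! It looks like the error comes from this line:

`data['scores'] = clf.decision_function(X_train)`

`X_train` holds only the first 1000 rows, so `decision_function` returns 1000 scores, but `data` is the full dataset with 1470 rows. pandas cannot assign 1000 values to a column in a 1470-row DataFrame, which is exactly what the `ValueError` says. Make sure the array you assign has the same length as `data`, for example by scoring `X` instead of `X_train`.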
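One way to apply that fix is sketched below. It is only an illustration under some assumptions: the DataFrame is a synthetic stand-in with 1470 numeric rows (not the real attrition data, whose non-numeric columns would need encoding first), the column names `f0` to `f3` and `random_state=0` are made up for reproducibility, and the model is fitted explicitly on `X_train` before scoring. The key change is that `decision_function` and `predict` are both called on `X`, so each result has one value per row of `data`.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import IsolationForest

# Synthetic stand-in for the dataset (assumption): 1470 rows, numeric columns only.
rng = np.random.default_rng(0)
data = pd.DataFrame(rng.normal(size=(1470, 4)), columns=["f0", "f1", "f2", "f3"])

# Snapshot the feature names before any result columns are added to data.
features = list(data.columns)

X = data[features]
X_train = X[:1000]

# Fit the model on the training slice only.
clf = IsolationForest(n_estimators=50, max_samples=100, random_state=0)
clf.fit(X_train)

# Score every row of X, so the results match the length of data (1470).
data["scores"] = clf.decision_function(X)
data["anomaly"] = clf.predict(X)

# Rows flagged as anomalies are labelled -1 by predict.
outliers = data.loc[data["anomaly"] == -1]
```

If you only want scores for the training rows, assign them to a frame of the same length instead, for example a copy of `X_train`, rather than to the full `data`.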

Posted on:

29 Dec 2022


Ohh! That makes more sense now. I'll make the corrections and give feedback. Thanks.
