Last answered: 15 Nov 2023

Posted on: 12 Nov 2023


Resolved: Standardising the features

Previously, we used sklearn's StandardScaler class to standardise our features. In this lesson, we used a function from sklearn's preprocessing module instead. Is there a benefit to using one over the other, or is it purely a matter of preference?

1 answer (1 marked as helpful)
Instructor
Posted on: 15 Nov 2023


Hey Ryan,


Thanks for reaching out.


The sklearn.preprocessing.StandardScaler class and the sklearn.preprocessing.scale function both serve the same purpose: standardizing features by removing the mean and scaling to unit variance. However, they differ in how they are used and integrated into your code.


StandardScaler is a class that implements the Transformer API: it first fits to the data, then transforms it. A StandardScaler object, therefore, comes with fit(), transform(), and fit_transform() methods. Among other benefits, a fitted StandardScaler instance can be easily saved and loaded, allowing the scaling parameters learned from the training data to be reused on new data.
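
Here is a minimal sketch of that workflow (the array values are made up purely for illustration):

import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training and test data, for illustration only
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
X_test = np.array([[4.0, 40.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns the mean and std from the training data
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics on unseen data

# The fitted object can also be persisted, e.g. with joblib,
# so the same scaling parameters can be reloaded later:
# from joblib import dump, load
# dump(scaler, 'scaler.joblib')
# scaler = load('scaler.joblib')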


On the other hand, scale() is a function that performs the standardization in a single call: it computes the mean and standard deviation of the data you pass in and applies the transformation immediately. Because it keeps no internal state, the statistics cannot be reused; each call recomputes them from the given data. It also lacks the fit(), transform(), and fit_transform() methods of a transformer class. This makes it handy for quick, one-off scaling where there is no need to fit a scaler object, which is why it suits simple scripts like the one used in the lecture.
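
For comparison, a quick sketch of the function-based approach (again with made-up values):

import numpy as np
from sklearn.preprocessing import scale

X = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])

# One-shot standardization: the mean and standard deviation are
# computed from X itself and applied immediately; nothing is stored
X_scaled = scale(X)

print(X_scaled.mean(axis=0))  # approximately 0 for each column
print(X_scaled.std(axis=0))   # approximately 1 for each column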


Hope this helps!


Kind regards,

365 Hristina
