Resolved: Standardising the features
Previously we used sklearn's StandardScaler class to standardise our features. In this lesson we used the scale function from sklearn's preprocessing module instead. Is there a benefit to using one over the other, or is it purely preference?
Thanks for reaching out.
The sklearn.preprocessing.StandardScaler class and the sklearn.preprocessing.scale function both standardize features by removing the mean and scaling to unit variance. However, they differ in how they are used and integrated into your code.
StandardScaler is a class that implements the Transformer API: it first fits to data, learning the mean and standard deviation of each feature, and then transforms data using those learned statistics. A StandardScaler object therefore exposes fit(), transform(), and fit_transform() methods. Among other benefits, a fitted StandardScaler can be easily saved and loaded, so the same scaling parameters can be reused later, for example to transform new data at prediction time.
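As a minimal sketch of that workflow (the small arrays here are illustrative, not from the lesson), the scaler learns its statistics from the training data and then reuses them on new data:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative data: two features on very different scales
X_train = np.array([[1.0, 10.0],
                    [2.0, 20.0],
                    [3.0, 30.0]])
X_test = np.array([[2.0, 25.0]])

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean/std from X_train, then transforms it
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics on new data

print(scaler.mean_)   # per-feature means learned during fit
print(scaler.scale_)  # per-feature standard deviations learned during fit
```

Because the fitted object holds `mean_` and `scale_`, it can be pickled (or saved with joblib) alongside a trained model and applied to future data with exactly the same parameters.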
On the other hand, scale() is a function that performs the standardization in a single call: it computes the mean and standard deviation of the data you pass in and applies the transformation directly. It maintains no internal state, so each call recomputes the statistics from scratch, and there are no separate fit() and transform() steps to reuse on new data. It's useful for quick, straightforward scaling when you don't need to keep the fitted parameters around, which makes it especially handy in simple scripts (like the one used in the lecture).
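A short sketch of the stateless version, using the same kind of illustrative data as above; it also shows that scale() is equivalent to subtracting the column means and dividing by the column standard deviations in one shot:

```python
import numpy as np
from sklearn.preprocessing import scale

# Illustrative data
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# scale() computes X's own mean/std and applies them immediately;
# nothing is stored for later reuse
X_scaled = scale(X)

# Equivalent manual computation (population std, ddof=0, matching sklearn's default)
manual = (X - X.mean(axis=0)) / X.std(axis=0)
```

The trade-off: if you called scale() separately on training and test sets, each would be standardized with its own statistics, which is usually not what you want in a modelling pipeline; that's the case where StandardScaler is the better fit.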
Hope this helps!