Should the scaling method and the activation function of the output layer be compatible?
For instance, when I scale the data with min-max scaling, (x - xmin) / (xmax - xmin), must I use a sigmoid in the output layer so that both the target and the output lie in the same range [0, 1] and can therefore be compared directly? Does that make sense?
What if I use standardization instead, (x - mean) / std? Must I then use a specific activation function?
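To make the difference between the two scalings concrete, here is a small sketch (the sample array is just an illustration): min-max scaling always lands in [0, 1], while standardization produces an unbounded range.

```python
import numpy as np

x = np.array([2.0, 4.0, 6.0, 10.0])

# Min-max scaling: maps the data into [0, 1]
minmax = (x - x.min()) / (x.max() - x.min())

# Standardization: zero mean, unit variance; the range is unbounded
standard = (x - x.mean()) / x.std()

print(minmax)    # all values lie in [0, 1]
print(standard)  # values can be negative or exceed 1
```

This is why the question arises: a sigmoid output can only produce values in (0, 1), which matches the first scaling but not the second.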
Thank you in advance.
No, for neural networks that is not a prerequisite. The output layer's activation does not need a range that matches the targets: a linear (identity) output can learn to produce values in any range, including [0, 1]. Matching the ranges (e.g. a sigmoid for min-max-scaled targets) can be convenient, but the target and the output activation need not share the same natural domain. The one genuine constraint runs the other way: an activation whose range cannot cover the targets (e.g. a sigmoid with standardized targets, which can be negative or exceed 1) will make some targets unreachable.
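As a minimal sketch of this point (a single linear neuron trained by gradient descent, with made-up data), a linear output fits targets that lie well outside [0, 1], which a sigmoid output could never reach:

```python
import numpy as np

rng = np.random.default_rng(0)

# Targets span values far outside [0, 1], as standardized data can.
X = rng.normal(size=(100, 1))
y = 3.0 * X[:, 0] + 0.5

# One linear neuron (identity activation) trained by gradient descent
w, b = 0.0, 0.0
for _ in range(500):
    pred = w * X[:, 0] + b
    grad_w = 2 * np.mean((pred - y) * X[:, 0])
    grad_b = 2 * np.mean(pred - y)
    w -= 0.1 * grad_w
    b -= 0.1 * grad_b

print(w, b)  # converges near the true w = 3.0, b = 0.5
```

With a sigmoid output the same training could only ever predict values in (0, 1), so most of these targets would be unreachable; the identity output has no such restriction.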