Scaling & activation function
Hello,
Should the scaling method and the activation function of the output layer be compatible?
For instance, when I scale the data using (x - xmin)/(xmax - xmin), should I necessarily use a sigmoid in the output layer, so that both the target and the output have the same range [0, 1] and can therefore be compared? Does that make sense?
What if I use standardization [(x - mean)/std] instead? Must I then use a specific activation function?
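For reference, here is a minimal sketch of the two scalings in question (assuming NumPy; the sample values are made up):

```python
import numpy as np

x = np.array([2.0, 5.0, 9.0, 14.0])  # hypothetical feature values

# Min-max scaling: maps the data into [0, 1].
x_minmax = (x - x.min()) / (x.max() - x.min())

# Standardization: zero mean, unit standard deviation; the range is unbounded.
x_std = (x - x.mean()) / x.std()

print(x_minmax)  # all values in [0, 1]
print(x_std)     # roughly in [-2, 2] here, but not bounded in general
```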
Thank you in advance.
1 answer (0 marked as helpful)
Hi Hady,
For neural networks that is not a prerequisite: the input scaling and the output activation need not share the same natural domain. The network's weights can map one range onto another, so the only practical constraint is that the range of the output activation covers the range of your targets. A sigmoid output is simply a convenient match for targets scaled to [0, 1], while a linear (identity) output can produce any real value and therefore works with either scaling scheme.
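As a small illustration, here is a sketch (assuming PyTorch; the layer sizes are arbitrary) of the two output-layer choices:

```python
import torch
import torch.nn as nn

# Targets scaled to [0, 1]: a sigmoid output matches the target range,
# though a plain linear output could also learn to land in [0, 1].
net_sigmoid = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid())

# Standardized targets (zero mean, unit std): a linear output is the
# usual choice, since its range is all real numbers.
net_linear = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 1))

x = torch.randn(3, 4)           # a batch of 3 samples with 4 features
print(net_sigmoid(x).detach())  # outputs in (0, 1)
print(net_linear(x).detach())   # unbounded outputs
```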
Best,
Iliya