Does dropout apply to a single layer or to all the hidden layers?
Good morning. I'd like to check that I've understood correctly: with a dropout of 0.5, do we randomly remove 50% of the neurons in all the hidden layers? Are the input and output layers included as well? And if all layers are involved, in what sense do we double the number of neurons of the next layer? Or do we instead double the weights that remain? Thank you
The dropout layer drops neurons only from the layer immediately before it - you can think of it as "temporarily severing" part of the connection between the previous layer and the next one.
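A minimal NumPy sketch of this idea (the layer sizes and seed are made up for illustration): dropout builds a random binary mask over the activations of the preceding layer and zeroes out the masked neurons, leaving every other layer untouched.

```python
import numpy as np

rng = np.random.default_rng(0)

# Activations of the hidden layer that sits just before the dropout layer
# (4 hidden neurons, one training example -- hypothetical values).
hidden = np.array([0.8, 0.3, 1.2, 0.5])

rate = 0.5                               # dropout rate of 0.5
mask = rng.random(hidden.shape) >= rate  # keep each neuron with prob 1 - rate
dropped = hidden * mask                  # "severed" neurons become exactly 0

print(dropped)  # the masked entries of `hidden` are now 0
```

Only `hidden` is affected; the next layer simply receives zeros in place of the dropped activations.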
Hope this helps!
Nikola, 365 Team
Thanks. Part of my question still remains: in what sense, when we use dropout, do we double the number of neurons of the next layer? Take this example: 2 input neurons, 4 hidden neurons, dropout at 50%, 3 output neurons. Dropout could "switch off" hidden neurons 1 and 4. Then a new dropout mask could switch off hidden neurons 3 and 4, training continues, and so on. (Actually, I don't know whether dropout picks a new 50% of the previous layer at the end of every epoch or at the end of every batch - which of the two?) Now, what does "double the number of neurons of the next layer" mean? Here it would mean the output neurons become 6 instead of 3. What would be the point of that? Or does this apply only when the neurons to double are in a hidden layer? Thank you
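For reference, one common reading of the "doubling" is the scaling used in inverted dropout (the scheme most modern libraries implement): no extra neurons are added anywhere; instead, the activations that survive are multiplied by 1/(1 - rate), which is a factor of 2 when the rate is 0.5, so the expected signal reaching the next layer stays the same. A hedged NumPy sketch, with made-up activation values:

```python
import numpy as np

rng = np.random.default_rng(1)

hidden = np.array([0.8, 0.3, 1.2, 0.5])  # 4 hidden neurons (hypothetical)
rate = 0.5

# Inverted dropout: drop neurons with probability `rate`, then scale the
# survivors by 1 / (1 - rate) -- i.e. double them when rate = 0.5.
mask = rng.random(hidden.shape) >= rate
dropped = hidden * mask / (1.0 - rate)

# Each surviving activation is doubled; dropped ones are 0, so on average
# the total input to the next layer is preserved.
print(dropped)
```

A fresh mask is typically drawn for every forward pass, i.e. for every batch, not once per epoch.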