
Get Weights


Hello,
First of all I would like to thank you for your robust content that really helped me.
Secondly, concerning the business case example: after running the code, I tried getting the weights for the first hidden layer. I know it is a very big tensor (10 × 50), but the problem is that each time I run the code I get a completely different set of weights. At first I thought the weights should converge over the iterations, as in the minimal example, until they are held constant, which would mean there is only one possible array of weights for each example. I guess this is not the case for nonlinear models, because there seem to be infinitely many different weight matrices that lead to a decent test accuracy. Am I right? Should I focus on validation and test accuracy regardless of the weights obtained? Should different people get different weights for the same case, with all of them being acceptable?
I am asking this because I spent a lot of time trying to obtain the same weight matrix as an online research paper, but I couldn't.
So sorry for the inconvenience and thank you in advance.
Kind Regards.

1 Answer

365 Team

Hi Hady,
It is completely normal to experience this.
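The main reason is that the weights are randomly initialized each time you run the code, so training starts from a different point every time. Moreover, the hidden-layer weights are not pinned to a single solution: the loss of a nonlinear network has many minima, and, for example, reordering the hidden units gives a different weight matrix that computes exactly the same function. So each run settles on a different set of weights, usually with comparable loss and accuracy, reached along a different path. Judge the model by its validation and test accuracy rather than by the particular weight values.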
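To make this concrete, here is a minimal sketch in plain Python (not the course's TensorFlow code). The network size (2 inputs, 3 hidden units, 1 output), the `tanh` activation, and the helper names `init_weights` and `forward` are all assumptions chosen for illustration. It shows two things: the random seed decides the starting weights, and reordering the hidden units changes the weight matrices without changing the network's output.

```python
import math
import random

def init_weights(seed, n_in=2, n_hidden=3):
    """Randomly initialise a tiny 2-3-1 network (sizes are illustrative)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return w1, b1, w2

def forward(x, w1, b1, w2):
    """Network output: tanh hidden layer followed by a linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    return sum(w * h for w, h in zip(w2, hidden))

# Same seed -> identical starting weights; different seed -> different ones.
same_seed = init_weights(0) == init_weights(0)
different_seed = init_weights(0) != init_weights(1)

# Reorder the hidden units: the weight matrices change, the function does not.
w1, b1, w2 = init_weights(0)
order = [2, 0, 1]
w1_p = [w1[i] for i in order]
b1_p = [b1[i] for i in order]
w2_p = [w2[i] for i in order]

x = [0.5, -1.2]
out_original = forward(x, w1, b1, w2)
out_permuted = forward(x, w1_p, b1_p, w2_p)

print("same seed, same weights:", same_seed)
print("different seed, different weights:", different_seed)
print("weights differ after reordering:", w1 != w1_p)
print("outputs match:", math.isclose(out_original, out_permuted))
```

With 50 hidden units there are already 50! such reorderings of any one solution, so matching a published weight matrix value by value is not a realistic target unless you reproduce the exact seed, code, and library versions.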
For simpler models like linear regression (a NN with no hidden layers), the loss has a single minimum and there is a closed-form mathematical solution that needs no gradient descent, so each new run of the model usually yields the same result.
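The linear case can be checked with a small sketch, again in plain Python and under assumed toy data (a noisy line y = 3x + 2; the helper names `closed_form` and `gradient_descent`, the learning rate, and the step count are illustrative choices). Gradient descent started from two different random points should end at the same weights as the closed-form least-squares solution.

```python
import random

# Toy data: y = 3x + 2 plus a little noise (values are illustrative).
rng = random.Random(42)
xs = [i / 10 for i in range(20)]
ys = [3 * x + 2 + rng.gauss(0, 0.1) for x in xs]
n = len(xs)

def closed_form(xs, ys):
    """Ordinary least squares for one feature; no iteration needed."""
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    w = cov / var
    return w, mean_y - w * mean_x

def gradient_descent(seed, lr=0.1, steps=2000):
    """Minimise mean squared error starting from a random (w, b)."""
    init = random.Random(seed)
    w, b = init.uniform(-5, 5), init.uniform(-5, 5)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w_exact, b_exact = closed_form(xs, ys)
w_a, b_a = gradient_descent(seed=1)
w_b, b_b = gradient_descent(seed=2)

print("closed form:   ", round(w_exact, 4), round(b_exact, 4))
print("descent, run 1:", round(w_a, 4), round(b_a, 4))
print("descent, run 2:", round(w_b, 4), round(b_b, 4))
```

All three lines should agree to the printed precision, which is the behaviour seen in the minimal example: one minimum, so one set of weights regardless of where training starts.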
Hope this helps!
Best,
Iliya 
