Gradient of the loss function with respect to weights
Thank you very much for this great lecture. Although I understood most of it, one question remains: what derivation steps take us from

$$W_{i+1} = W_i - \eta \, \nabla_W L(y, t)$$

to

$$W_{i+1} = W_i - \eta \sum x_i \delta_i \;?$$

If describing the entire process would be too much trouble for some reason, could you please point me to some online resources?
Thank you.
Hello, this is a late reply, but let me explain. You should look up the concept of a "partial derivative". $\nabla_W$ means the derivative is taken with respect to $w$: when taking the partial derivative, $w$ is treated as the only variable and everything else as a constant. So for a squared-error term $\frac{1}{2}(x_i w + b - t_i)^2$, the chain rule gives $\frac{1}{2} \cdot 2 \cdot (x_i w + b - t_i)^{2-1} \cdot x_i = (x_i w + b - t_i)\, x_i$, and that error factor $(x_i w + b - t_i)$ is exactly $\delta_i$.
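To spell out the remaining step (a sketch, assuming the lecture's loss is squared error summed over the training samples, $L = \frac{1}{2}\sum_i (x_i w + b - t_i)^2$):

$$\frac{\partial L}{\partial w} = \sum_i (x_i w + b - t_i)\, x_i = \sum_i \delta_i x_i$$

because the derivative of a sum is the sum of the derivatives of its terms. Substituting this into $W_{i+1} = W_i - \eta \, \nabla_W L(y, t)$ gives the second update rule: the sum comes from the loss itself being a sum over samples.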
Great content, but I have the same doubt. I understand why the derivative is what it is, but I don't get where the sum comes from. Thanks in advance!
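In case a numeric illustration helps, here is a minimal sketch of the summed update in code (assuming a one-dimensional linear model and a squared-error loss summed over the batch; variable names such as `eta` and `delta` are illustrative, not from the lecture):

```python
import numpy as np

# Toy batch: N = 4 samples of a linear model y = x * w + b
x = np.array([0.5, 1.0, 1.5, 2.0])  # inputs x_i
t = np.array([1.0, 2.0, 3.0, 4.0])  # targets t_i (here t = 2x)
w, b = 0.0, 0.0                     # initial parameters
eta = 0.1                           # learning rate (the eta above)

for step in range(100):
    y = x * w + b          # predictions for the whole batch
    delta = y - t          # per-sample error delta_i
    # L = 1/2 * sum_i (y_i - t_i)^2 is a sum over samples, so its
    # derivative is also a sum: dL/dw = sum_i delta_i * x_i
    grad_w = np.sum(delta * x)
    grad_b = np.sum(delta)
    w -= eta * grad_w      # W_{i+1} = W_i - eta * sum x_i * delta_i
    b -= eta * grad_b

print(w, b)  # w approaches 2.0 and b approaches 0.0
```

The sum appears in `grad_w` purely because the loss adds one term per training sample; differentiating term by term produces one $x_i \delta_i$ per sample.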