Thank you very much for this great contribution.
I found that the masked LM loss stops decreasing once it reaches a value around 7. However, in the official TensorFlow implementation, the MLM loss easily decreases to 1. I think something went wrong in your implementation.
In addition, I found the code cannot predict the next sentence correctly. I think the reason is `self.criterion = nn.NLLLoss(ignore_index=0)`. It cannot be used as the criterion for sentence prediction, because the sentence label is 0 or 1, so `ignore_index=0` silently drops every negative (label 0) example. We should remove `ignore_index=0` for sentence prediction, e.g. as sketched below.
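A minimal sketch of the suggested fix, assuming (as is typical for this kind of codebase, not confirmed here) that label 0 in the MLM targets marks padding/unmasked positions; the variable names are illustrative:

```python
import torch.nn as nn

# MLM head: target id 0 marks padding / unmasked positions, so ignoring
# index 0 correctly skips positions that carry no prediction target.
mlm_criterion = nn.NLLLoss(ignore_index=0)

# NSP head: labels are 0 (not next) or 1 (is next). ignore_index=0 would
# silently drop every negative example, so use a plain NLLLoss here.
nsp_criterion = nn.NLLLoss()
```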
I am looking forward to your reply~
> I think the reason is `self.criterion = nn.NLLLoss(ignore_index=0)`. It cannot be used as the criterion for sentence prediction, because the sentence label is 0 or 1.
I think you are right.
My next sentence loss is very low, but the next_correct accuracy is always near 50%.
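That combination is exactly what the `ignore_index=0` bug would produce: the label-0 (not-next) examples contribute nothing to the loss, so the model can drive the loss down by always predicting 1, while accuracy on a balanced set stays at chance. A quick sanity check with illustrative tensors (not code from the repo):

```python
import torch
import torch.nn as nn

log_probs = torch.log_softmax(torch.randn(4, 2), dim=-1)
labels = torch.tensor([0, 0, 1, 1])  # balanced not-next / is-next batch

# Buggy criterion: the two label-0 rows are ignored entirely.
buggy = nn.NLLLoss(ignore_index=0)(log_probs, labels)

# Correct criterion: every example contributes to the loss.
fixed = nn.NLLLoss()(log_probs, labels)
print(buggy.item(), fixed.item())
```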
I've been trying to reproduce BERT's pretraining results from scratch in my own time, and I have been unable to train beyond a masked LM loss of 5.4. So if anyone is able to get past this point, I'd love to learn what you did.