Difference between revisions of "NLP Status Report 2017-5-22"
From cslt Wiki
Line 17: | Line 17: | ||
* The training process of double-decoder model '''without''' joint loss is problematic. | * The training process of double-decoder model '''without''' joint loss is problematic. | ||
|| | || | ||
− | * Overfitting? Train | + | * Overfitting? Train 2nd translator on large data |
− | * Replace the | + | * Replace the teacher forcing mechanism in the training process with a beam search mechanism.
|- | |- | ||
|Shiyue Zhang || | |Shiyue Zhang || |
Revision as of 06:15, 24 May 2017
Date | People | Last Week | This Week |
---|---|---|---|
2017/5/22 | Jiyuan Zhang | ||
Aodong LI |
* hidden_size, emb_size, lr = 500, 310, 0.001; bleu = 43.53 (best)
* hidden_size, emb_size, lr = 700, 510, 0.001; bleu = 45.21 (best), but most results are under 43.1
* hidden_size, emb_size, lr = 700, 510, 0.0005; bleu = 42.19 (best)
* bleu = 40.11 (best)
* The 1st decoder's output is generally better than the 2nd decoder's output.
|
| |
Shiyue Zhang |
|
| |
Shipan Ren |
|
|
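
The "This Week" plan for the double-decoder model mentions replacing the forced teaching (teacher forcing) mechanism with beam search. As a rough illustration of the beam search side only, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `TOY_TABLE` probabilities, the `toy_step` function standing in for a decoder step, and the `beam_search` helper are hypothetical and are not taken from the report's code. How beam search would actually be wired into the training loop is not specified in the report and is not shown here.

```python
import math

# Hypothetical toy "decoder": next-token probabilities for a given prefix.
# These numbers are invented purely to illustrate the search procedure.
TOY_TABLE = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"</s>": 0.4, "a": 0.3, "b": 0.3},
    ("b",): {"</s>": 0.9, "a": 0.05, "b": 0.05},
}


def toy_step(tokens):
    """Return {token: log_prob} for the next token given the prefix."""
    # Prefixes missing from the table simply end the sequence.
    probs = TOY_TABLE.get(tuple(tokens), {"</s>": 1.0})
    return {tok: math.log(p) for tok, p in probs.items()}


def beam_search(step_fn, beam_width, max_len, eos="</s>"):
    """Keep the beam_width best partial hypotheses at every step."""
    beams = [(0.0, [])]  # (cumulative log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        # Expand every live hypothesis by every possible next token.
        candidates = []
        for score, tokens in beams:
            for tok, log_p in step_fn(tokens).items():
                candidates.append((score + log_p, tokens + [tok]))
        # Prune to the best beam_width candidates.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = []
        for score, tokens in candidates[:beam_width]:
            if tokens[-1] == eos:
                finished.append((score, tokens))
            else:
                beams.append((score, tokens))
        if not beams:
            break
    finished.extend(beams)  # keep unfinished hypotheses if max_len was hit
    return max(finished, key=lambda c: c[0])


if __name__ == "__main__":
    for width in (1, 2):
        score, tokens = beam_search(toy_step, beam_width=width, max_len=5)
        print(width, tokens, round(math.exp(score), 4))
```

With `beam_width=1` the search reduces to greedy decoding and commits to the locally best first token, while a wider beam can recover a sequence with higher overall probability. This is the usual motivation for exposing the decoder to its own predictions instead of only the reference prefix.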