Difference between revisions of "NLP Status Report 2017-6-5"

 
Latest revision as of 06:04, 5 June 2017

Date: 2017/6/5

Jiyuan Zhang

Aodong LI
Last week:
  • Small data:
      Keep only the English encoder's embedding constant -- 45.98
      Only initialize the English encoder's embedding and then fine-tune it -- 46.06
      Share the attention mechanism and then directly add them -- 46.20
  • Big data baseline: BLEU = 30.83
  • Model with three fixed embeddings:
      Shrink the output vocabulary from 30000 to 20000; best result is 31.53
      Train the model with batch size 40; best result so far is 30.63
This week:
  • Test more checkpoints on the model trained with batch size 40
  • Train the model with reversed output

Shiyue Zhang
Last week:
  • Trained word2vec on big data and used it directly in NMT, but performance was quite poor
  • Trained the M-NMT model and got BLEU = 36.58 (+1.34 over NMT), but found that the EOS in memory has a big influence on the result:

      Model            BLEU    1/2/3/4-gram precision   BP
      NMT              35.24   57.7/39.8/31.9/27.0      0.939
      MNMT (EOS=1)     35.27   60.0/41.3/33.1/28.0      0.907
      MNMT (EOS=0.2)   36.40   59.1/40.8/32.6/27.4      0.951
      MNMT (EOS=0)     36.58   58.4/40.4/32.1/27.0      0.968

  • Tried to tackle UNK using the 36.58 M-NMT model: increased the vocabulary to 50000 and got BLEU = 35.63, 58.6/40.0/31.6/26.4, BP=0.953 (not good, ?)
  • Training uy-zh, 50% zh-uy, 25% zh-uy
  • Training memory without EOS
  • Reviewing related papers
This week:
  • Solve the EOS problem
  • Find a way to tackle UNK
  • Write paper
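
The BLEU lines reported above for NMT and MNMT give a score, four n-gram precisions, and a brevity penalty (BP). As a rough consistency check, the sketch below recomputes each score from its parts, assuming the numbers follow the standard BLEU definition (BP times the geometric mean of the 1- to 4-gram precisions with uniform weights). The helper name `bleu_from_parts` and the `reported` dictionary are illustrative and not part of the original report; small differences are expected because the published precisions are rounded to one decimal place.

```python
import math


def bleu_from_parts(precisions_pct, brevity_penalty):
    """Return BLEU on a 0-100 scale: BP * geometric mean of the n-gram precisions.

    precisions_pct: n-gram precisions given in percent (e.g. 57.7).
    brevity_penalty: the BP value reported alongside the precisions.
    """
    log_mean = sum(math.log(p / 100.0) for p in precisions_pct) / len(precisions_pct)
    return 100.0 * brevity_penalty * math.exp(log_mean)


# (reported BLEU, n-gram precisions in percent, BP), copied from the report
reported = {
    "NMT": (35.24, [57.7, 39.8, 31.9, 27.0], 0.939),
    "MNMT (EOS=1)": (35.27, [60.0, 41.3, 33.1, 28.0], 0.907),
    "MNMT (EOS=0.2)": (36.40, [59.1, 40.8, 32.6, 27.4], 0.951),
    "MNMT (EOS=0)": (36.58, [58.4, 40.4, 32.1, 27.0], 0.968),
}

for name, (bleu, precisions, bp) in reported.items():
    recomputed = bleu_from_parts(precisions, bp)
    print(f"{name}: reported {bleu:.2f}, recomputed {recomputed:.2f}, BP {bp:.3f}")
```

Read this way, the EOS=1 run has the highest n-gram precisions but the lowest BP, so the improvement at EOS=0 appears to come mostly from longer outputs (a BP closer to 1) rather than from higher precision.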
Shipan Ren