Difference between revisions of "NLP Status Report 2017-5-31"

From cslt Wiki

Latest revision as of 09:00, 31 May 2017

Date People Last Week This Week
2017/5/31 Jiyuan Zhang
Aodong LI
  • coded a double-attention model with final_attn = alpha * attn_ch + beta * attn_en
  • baseline BLEU = 43.87
  • experiments with randomly initialized embedding:
alpha  beta  result (BLEU)
1      1     43.50
4/3    2/3   43.58 (w/o retrained)
2/3    4/3   41.22 (w/o retrained)
2/3    4/3   42.36 (w/ retrained)
  • experiments with constant initialized embedding:
alpha  beta  result (BLEU)
1      1     45.41
4/3    2/3   45.79
2/3    4/3   45.32
  • 1.4~1.9 BLEU improvement over the 43.87 baseline (constant initialized embedding)
  • This model is similar to multi-source neural translation but uses fewer resources
  • Test the model on big data
  • Explore different attention merge strategies
  • Explore hierarchical model
Shiyue Zhang
  • found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline (outproj=emb) 35.24
  • tried several embed set models, all of which failed
  • embedded other words into the model embedding space (trained on the training data, not the big data), then used them directly in baseline (outproj=emb)
30000        50000  70000  90000
35.24        34.52  33.73  33.16
4564 (6666)  4535   4469   4426
  • m-nmt is running
  • get word2vec embeddings on the big data and compare them with word2vec from the training data
  • test the m-nmt model, then increase the vocab size and test again
  • review zh-uy/uy-zh related work and start writing the paper
Shipan Ren
  • wrote documentation for the tf_translate project
  • read a neural machine translation paper
  • read the tf_translate code
  • ran and tested the tf_translate code
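
The double-attention merge in Aodong Li's entry, final_attn = alpha * attn_ch + beta * attn_en, can be illustrated with a minimal NumPy sketch. This is not the project's code: the function name merge_attention, the array shapes, and the random inputs are assumptions made for illustration, and attn_ch / attn_en are simply treated as two same-shaped arrays, one per source. The (alpha, beta) pairs are the three settings listed in the report, each of which sums to 2.

```python
import numpy as np


def merge_attention(attn_ch, attn_en, alpha=1.0, beta=1.0):
    """Weighted sum of two attention outputs (hypothetical helper).

    Implements final_attn = alpha * attn_ch + beta * attn_en from the report.
    Both inputs are assumed to have the same shape.
    """
    attn_ch = np.asarray(attn_ch, dtype=float)
    attn_en = np.asarray(attn_en, dtype=float)
    if attn_ch.shape != attn_en.shape:
        raise ValueError("attn_ch and attn_en must have the same shape")
    return alpha * attn_ch + beta * attn_en


# (alpha, beta) settings taken from the report's tables.
SETTINGS = [(1.0, 1.0), (4 / 3, 2 / 3), (2 / 3, 4 / 3)]

# Placeholder inputs: batch of 2, hidden size 4 (assumed sizes).
rng = np.random.default_rng(0)
attn_ch = rng.standard_normal((2, 4))
attn_en = rng.standard_normal((2, 4))

for alpha, beta in SETTINGS:
    final_attn = merge_attention(attn_ch, attn_en, alpha, beta)
    print(f"alpha={alpha:.2f} beta={beta:.2f} shape={final_attn.shape}")
```

Because the weights are plain scalars, exploring other merge strategies (for example learned weights or a gate) would only require swapping out this one function.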