NLP Status Report 2017-5-31

{| class="wikitable"
!Date !! People !! Last Week !! This Week
|-
| rowspan="6"|2017/5/31
 
|Jiyuan Zhang ||
||
 
|-
|Aodong LI ||
* coded the double-attention model with '''final_attn = alpha * attn_ch + beta * attn_en'''
* baseline BLEU = '''43.87'''
* experiments with '''randomly''' initialized embeddings:
{| class="wikitable"
|-
! alpha
! beta
! result (BLEU)
|-
| 1
| 1
| 43.50
|-
| 4/3
| 2/3
| 43.58 (w/o retraining)
|-
| 2/3
| 4/3
| 41.22 (w/o retraining)
|-
| 2/3
| 4/3
| 42.36 (w/ retraining)
|}
* experiments with '''constant''' initialized embeddings:
{| class="wikitable"
|-
! alpha
! beta
! result (BLEU)
|-
| 1
| 1
| '''45.41'''
|-
| 4/3
| 2/3
| '''45.79'''
|-
| 2/3
| 4/3
| '''45.32'''
|}
* 1.4~1.9 BLEU score improvement over the baseline
* this model is similar to multi-source neural translation, but uses fewer resources
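The merge rule above can be sketched in a few lines of numpy. This is an illustrative sketch, not the tf_translate implementation: it assumes attn_ch and attn_en are softmax-normalized weights over the same source length, and it renormalizes the merged weights (the report does not say whether renormalization is applied; with alpha + beta = 2, as in the tables, renormalizing just divides by 2).

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - np.max(x, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def merge_attention(scores_ch, scores_en, alpha=1.0, beta=1.0):
    """Combine two attention distributions as
    final_attn = alpha * attn_ch + beta * attn_en,
    then renormalize so the merged weights sum to 1."""
    attn_ch = softmax(scores_ch)
    attn_en = softmax(scores_en)
    merged = alpha * attn_ch + beta * attn_en
    return merged / merged.sum(axis=-1, keepdims=True)

# Hypothetical unnormalized alignment scores for a 4-position source
scores_ch = np.array([0.2, 1.5, -0.3, 0.8])
scores_en = np.array([1.0, 0.1, 0.4, -0.2])
attn = merge_attention(scores_ch, scores_en, alpha=4/3, beta=2/3)
print(attn)  # merged weights over the 4 positions, summing to 1
```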
 
||
* Test the model on big data
* Explore different attention merge strategies
* Explore a hierarchical model
 
|-
|Shiyue Zhang ||
* found a dropout bug, fixed it, and reran the baseline: baseline 35.21, baseline(outproj=emb) 35.24
* tried several embed-set models, which failed
* embedded other words into the model's embedding space (trained on the train data, not big data), then used them directly in baseline(outproj=emb):
{| class="wikitable"
|-
! 30000
! 50000
! 70000
! 90000
|-
| 35.24
| 34.52
| 33.73
| 33.16
|-
| 4564 (6666)
| 4535
| 4469
| 4426
|}
* m-nmt is running
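One reading of the baseline(outproj=emb) setting is a tied output projection: the softmax layer reuses the embedding matrix, so words embedded into the model's space can be scored at the output without retraining the projection. A minimal numpy sketch under that assumption (all sizes and the output_logits helper are illustrative, not from tf_translate):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab, dim = 1000, 64

# Shared embedding table, also used as the output projection (outproj=emb)
emb = rng.normal(scale=0.1, size=(vocab, dim))

def output_logits(hidden, emb):
    """Tied softmax projection: logit for word i is hidden . emb[i]."""
    return hidden @ emb.T

hidden = rng.normal(size=(dim,))
logits = output_logits(hidden, emb)
print(logits.shape)  # (1000,)

# Extending the vocabulary: words embedded externally and mapped into the
# model's embedding space are appended as new rows; the tied projection
# then scores them without retraining the output layer.
extra = rng.normal(scale=0.1, size=(500, dim))
emb_ext = np.vstack([emb, extra])
print(output_logits(hidden, emb_ext).shape)  # (1500,)
```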
 
||
* get word2vec on big data and compare it with word2vec from the train data
* test the m-nmt model; increase the vocab size and test again
* review zh-uy/uy-zh related work and start writing the paper
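For the word2vec comparison, one simple bookkeeping step is the average per-word cosine similarity over the shared vocabulary. Note that two independently trained word2vec spaces are only directly comparable after aligning them (e.g. with an orthogonal Procrustes rotation); the sketch below, with a hypothetical compare_embeddings helper and toy vectors, shows only the mechanics:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity of two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def compare_embeddings(vecs_big, vecs_train):
    """Average per-word cosine similarity over the shared vocabulary.
    Both arguments map word -> vector (e.g. loaded from word2vec runs)."""
    shared = sorted(set(vecs_big) & set(vecs_train))
    sims = [cosine(vecs_big[w], vecs_train[w]) for w in shared]
    return shared, float(np.mean(sims))

# Toy stand-in tables; real ones would come from the two word2vec runs
vecs_big = {"uy": np.array([1.0, 0.0]), "zh": np.array([0.0, 1.0])}
vecs_train = {"uy": np.array([1.0, 0.1]), "zh": np.array([0.5, 0.5])}
shared, avg = compare_embeddings(vecs_big, vecs_train)
print(shared, round(avg, 3))  # ['uy', 'zh'] 0.851
```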
 
|-
|Shipan Ren ||
 
+
* writed document of tf_translate project
 +
* read neural machine translation paper
 +
* read tf_translate code
 +
* run and tested tf_translate code
 
||
|}
Latest revision as of 09:00, 31 May 2017
