Difference between revisions of "NLP Status Report 2017-7-31"

From cslt Wiki
Line 14: Line 14:
 
|-
 
|-
 
|Aodong LI ||
 
|Aodong LI ||
 
+
* Got 55,000+ English poems and 260,000+ lines after preprocessing
 +
* Added phrase separators as the style indicator; every line has at least one separator
 +
* Training loss didn't decrease very much, only from 440 to 50
 +
* The translation quality deteriorated when the language model was added
 
||
 
||
 
+
* Try to use a larger language model to decrease the training loss
 +
* Try to use character-based MT for English-Chinese translation
 
|-
 
|-
 
|Shiyue Zhang ||  
 
|Shiyue Zhang ||  

Revision as of 04:49, 31 July 2017

Date People Last Week This Week
2017/7/3 Jiyuan Zhang
  • made the poster for ACL
  • attempted to fix the repeated-word problem, but failed
  • did some work on an n-gram model for couplets
  • generate a streamer for a given couplet
  • complete the task of filling in the blanks of a couplet
Aodong LI
  • Got 55,000+ English poems and 260,000+ lines after preprocessing
  • Added phrase separators as the style indicator; every line has at least one separator
  • Training loss didn't decrease very much, only from 440 to 50
  • The translation quality deteriorated when the language model was added
  • Try to use a larger language model to decrease the training loss
  • Try to use character-based MT for English-Chinese translation
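
As a rough illustration of the character-based MT idea above, the sketch below splits a sentence into single-character tokens and marks spaces with an explicit symbol so the original text can be restored. The function names and the "<sp>" marker are hypothetical choices for this example, not part of the reported system.

```python
# Minimal sketch of character-level tokenization (assumed scheme, not the
# team's actual preprocessing). Spaces become an explicit "<sp>" token so
# that detokenization is lossless for single-spaced text.

def char_tokenize(sentence, space_symbol="<sp>"):
    """Split a sentence into characters, replacing spaces with a marker."""
    return [space_symbol if ch == " " else ch for ch in sentence.strip()]


def char_detokenize(tokens, space_symbol="<sp>"):
    """Invert char_tokenize by turning the marker back into a space."""
    return "".join(" " if tok == space_symbol else tok for tok in tokens)


if __name__ == "__main__":
    print(char_tokenize("a rose"))   # English side: letters plus <sp>
    print(char_tokenize("玫瑰"))      # Chinese side: one token per character
```

With this scheme the Chinese side needs no word segmenter, at the cost of longer English sequences.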
Shiyue Zhang
Shipan Ren
  • trained two baseline models on the WMT2014 en-fr datasets (still under training)
  • read some papers (memory-augmented-nmt and Memory augmented Chinese-Uyghur Neural Machine Translation)
  • read memory-augmented-nmt code
  • read papers about memory augmented NMT
Jiayu Guo
  • process document.
  • Shiji has been split into 25,000 sentence pairs.
  • Zizhitongjian has been split into 20,000 pairs.
  • adjust jieba source code
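
The sentence-pair splitting mentioned above could look roughly like the sketch below. It assumes (this is not stated in the report) that each source passage has an aligned translation and that sentences can be cut on Chinese sentence-final punctuation; passages whose sentence counts differ are skipped rather than aligned. The names split_sentences and pair_sentences are hypothetical.

```python
import re

# Assumed sentence-final punctuation; the real pipeline may use other rules.
_SENTENCE = re.compile(r"[^。！？；]+[。！？；]?")


def split_sentences(text):
    """Cut text into sentences, keeping the closing punctuation mark."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]


def pair_sentences(source_text, target_text):
    """Pair sentences one-to-one; return None if the counts do not match."""
    src = split_sentences(source_text)
    tgt = split_sentences(target_text)
    if len(src) != len(tgt):
        return None  # naive alignment fails; such passages need manual review
    return list(zip(src, tgt))


if __name__ == "__main__":
    pairs = pair_sentences("學而時習之。不亦說乎？", "学了又时常温习。不是很愉快吗？")
    for classical, modern in pairs:
        print(classical, "->", modern)
```

One-to-one pairing is the simplest possible alignment; a real corpus would likely need a length-based or dictionary-based aligner for passages where the counts differ.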