Difference between revisions of "140428-Xiaoxi Wang"

From cslt Wiki
(Created page with "This week: preprocessed the baiduzhidao and part of weibo data. wrote a Hanzi2Num tool sampled corpora from weibo and baiduzhidao (4.4G) and grabbed the keywords from ...")
 
 

Latest revision as of 09:56, 28 April 2014

This week:

Preprocessed the baiduzhidao data and part of the weibo data.

Wrote a Hanzi2Num tool.

Sampled corpora from weibo and baiduzhidao (4.4G) and extracted keywords from them.

Classified the corpora according to the keywords.
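
The report does not say how the Hanzi2Num tool works. Going only by its name, it presumably converts Chinese numerals to Arabic numbers, a common normalization step before language model training. The following is a minimal sketch under that assumption; the function name hanzi_to_num, the character tables, and the supported range are all hypothetical and are not taken from the actual tool.

```python
# Hypothetical sketch of a Hanzi-to-number converter (not the author's tool).
DIGITS = {"零": 0, "〇": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
BIG_UNITS = {"万": 10**4, "亿": 10**8}


def hanzi_to_num(text: str) -> int:
    """Convert a Chinese numeral string to an int.

    Assumed limitations: no stacked big units (e.g. 万亿), no colloquial
    shortenings (e.g. 二百五 for 250), no decimals or negatives.
    """
    if not text:
        raise ValueError("empty numeral")

    # Pure digit strings such as 二〇一四 are read positionally.
    if all(ch in DIGITS for ch in text):
        return int("".join(str(DIGITS[ch]) for ch in text))

    total = 0    # sum of sections already closed by 万 or 亿
    section = 0  # value accumulated since the last big unit
    digit = 0    # most recent digit not yet multiplied by a unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit, as in 十 or 十五, counts as one of that unit.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in BIG_UNITS:
            total += (section + digit) * BIG_UNITS[ch]
            section = 0
            digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit
```

A call such as hanzi_to_num("一百零五") would then stand in for the numeral 一百零五 during text normalization. Real weibo and baiduzhidao text would also need a step that locates numeral spans inside running sentences, which this sketch leaves out.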


Next week:

Train and evaluate language models on the classified corpora.

Improve the algorithms.