140915 - Xiaoxi Wang
Fixed a minor bug in LM v2.1.4 and v2.1.5
Extracted corpora from baiduzhidao using a cross-entropy criterion and trained an LM on them
Built a small (90k) vocabulary for the LM; its performance will be tested later (some tasks are assigned to Zhang Chengcheng)
A new LM trained with more data from baiduzhidao is also ready for testing.
Fix the bug where LM v2.1.x cannot correctly decode most English words (e.g. wifi, modem).
Add weibo data to the training corpora and train more LMs
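
Note on the cross-entropy extraction above: the log does not record the exact criterion, so the following is only a minimal sketch of one common approach, assuming candidate sentences are kept when their per-token cross entropy under an in-domain model falls below a threshold. The unigram model, the add-one smoothing, the whitespace tokenization (which presumes already word-segmented text), and the names train_unigram, cross_entropy, and select_sentences are all illustrative assumptions, not the pipeline actually used.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Build an add-one-smoothed unigram model; return a log2-probability function."""
    counts = Counter(tok for s in sentences for tok in s.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # +1 reserves probability mass for unseen tokens

    def logprob(token):
        # Counter returns 0 for unseen tokens, so they get the smoothed floor.
        return math.log2((counts[token] + 1) / (total + vocab_size))

    return logprob


def cross_entropy(sentence, logprob):
    """Per-token cross entropy (bits) of a whitespace-tokenized sentence."""
    tokens = sentence.split()
    if not tokens:
        return float("inf")  # empty lines are never selected
    return -sum(logprob(t) for t in tokens) / len(tokens)


def select_sentences(candidates, in_domain, threshold):
    """Keep candidates whose cross entropy under the in-domain model is below threshold."""
    logprob = train_unigram(in_domain)
    return [s for s in candidates if cross_entropy(s, logprob) < threshold]


if __name__ == "__main__":
    # Hypothetical toy data; real input would be segmented baiduzhidao text.
    in_domain = ["how to reset wifi modem", "wifi modem setup"]
    candidates = ["wifi modem", "stock market news"]
    print(select_sentences(candidates, in_domain, threshold=3.0))
```

In practice the in-domain model would more likely be an n-gram LM built with a dedicated toolkit, and the threshold would be tuned on held-out data.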