1. Learn the basics of language modeling.
2. Segment the training and test sentences with the 150k lexicon, then compute the perplexity (PPL) of the
test set /nfs/home/zhangzhiyong/work/train_470h/test/huawei_disanpi.txt using the following LM:
3. Build a new LM using the lexicon extended with the keywords; re-segment the test files and measure the PPL again.
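
The segment-then-score loop in the items above can be sketched as follows. This is only an illustration under stated assumptions: the notes do not say which segmenter or LM toolkit is used, so forward maximum matching and an add-one-smoothed unigram model stand in for them here, and the function names and the toy lexicons are hypothetical.

```python
import math
from collections import Counter


def segment(sentence, lexicon, max_len=4):
    """Forward maximum matching (assumed segmenter, not confirmed by the notes).

    At each position take the longest substring found in the lexicon;
    fall back to a single character when nothing matches.
    """
    words, i = [], 0
    while i < len(sentence):
        for j in range(min(len(sentence), i + max_len), i, -1):
            if sentence[i:j] in lexicon or j == i + 1:
                words.append(sentence[i:j])
                i = j
                break
    return words


def train_unigram(segmented_sentences, vocab_size):
    """Return a log-probability function for an add-one-smoothed unigram LM.

    A stand-in for the real LM: the actual model is most likely a
    higher-order n-gram built with a dedicated toolkit.
    """
    counts = Counter(w for s in segmented_sentences for w in s)
    total = sum(counts.values())

    def logprob(word):
        return math.log((counts[word] + 1) / (total + vocab_size))

    return logprob


def perplexity(logprob, segmented_sentences):
    """PPL = exp(-(1/N) * sum of log p(w)) over all N test tokens."""
    n_tokens = sum(len(s) for s in segmented_sentences)
    log_likelihood = sum(logprob(w) for s in segmented_sentences for w in s)
    return math.exp(-log_likelihood / n_tokens)


# Hypothetical toy lexicons: the second one adds a keyword.
base_lexicon = {"训练", "语言", "模型"}
keyword_lexicon = base_lexicon | {"语言模型"}

base_words = segment("训练语言模型", base_lexicon)        # ['训练', '语言', '模型']
keyword_words = segment("训练语言模型", keyword_lexicon)  # ['训练', '语言模型']
```

Because adding keywords changes the segmentation, the token count N in the PPL formula changes as well, so PPL values measured before and after re-segmentation are not directly comparable token for token.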
1. Extract the sentences of the related domain from the original corpus.
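
One simple way to do this extraction is keyword matching, sketched below. The notes do not state the selection criterion, so the keyword test, the function name `extract_domain_sentences`, and the `min_hits` threshold are all assumptions made for illustration.

```python
def extract_domain_sentences(sentences, keywords, min_hits=1):
    """Keep sentences containing at least `min_hits` distinct keywords.

    Substring matching is used so the filter also works on unsegmented text.
    This is an assumed criterion, not the method confirmed by the notes.
    """
    selected = []
    for sentence in sentences:
        hits = sum(1 for keyword in keywords if keyword in sentence)
        if hits >= min_hits:
            selected.append(sentence)
    return selected


# Hypothetical toy corpus and keyword list.
corpus = [
    "the router drops packets",
    "the weather is nice",
    "switch and router config",
]
domain_keywords = {"router", "switch"}
in_domain = extract_domain_sentences(corpus, domain_keywords)
```

Raising `min_hits` trades recall for precision: fewer sentences are kept, but each is more likely to belong to the target domain.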