From cslt Wiki

Resource Building

  • The current text resources have been rearranged and listed

Leftover questions

  • Asymmetric window: large improvement on the training set (WER 34% to 24%), but the improvement is lost on the test set. Overfitting?
  • Multi-GPU training: error encountered
  • Multilanguage training
  • Investigating LOUDS FST.
  • CLG embedded decoder plus online compiler.
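
The asymmetric-window question above comes down to comparing WER on the training and test sets. As a reminder of how that number is computed, here is a minimal sketch of word error rate via edit distance. The function name `word_error_rate` is illustrative and not taken from any toolkit used here; the actual scoring tool may normalize text differently.

```python
def word_error_rate(reference, hypothesis):
    """Return WER = (substitutions + deletions + insertions) / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the reference prefix seen so far
    # (before the current word) and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr[j] = min(prev[j] + 1,          # deletion
                          curr[j - 1] + 1,      # insertion
                          prev[j - 1] + cost)   # match or substitution
        prev = curr
    return prev[-1] / len(ref)
```

A large gap between the value on training utterances and on held-out utterances is the usual sign of overfitting.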

AM development

Sparse DNN

  • GA-based block sparsity
  • code ready, testing on pure matrix multiplication
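
To illustrate what block sparsity buys in a pure matrix multiplication test, the following is a minimal sketch of a block-sparse matrix-vector product. It assumes the block mask is already given (the GA search that selects it is not shown), and the names `block_sparse_matvec`, `block_mask`, and `block_size` are hypothetical, not taken from the actual code.

```python
def block_sparse_matvec(weights, block_mask, block_size, x):
    """Compute weights @ x, treating blocks whose mask entry is False as zero.

    weights: dense matrix as a list of rows.
    block_mask[bi][bj]: True if block (bi, bj) is kept, False if pruned.
    """
    n_rows = len(weights)
    n_cols = len(x)
    y = [0.0] * n_rows
    for bi, mask_row in enumerate(block_mask):
        row_start = bi * block_size
        row_end = min(row_start + block_size, n_rows)
        for bj, keep in enumerate(mask_row):
            if not keep:
                continue  # a pruned block costs no multiplications
            col_start = bj * block_size
            col_end = min(col_start + block_size, n_cols)
            for r in range(row_start, row_end):
                row = weights[r]
                acc = 0.0
                for c in range(col_start, col_end):
                    acc += row[c] * x[c]
                y[r] += acc
    return y
```

The work done is proportional to the number of kept blocks, which is why a whole-block pattern is easier to speed up than unstructured element-wise sparsity.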

GMM/DNN co-training

  • Co-training using Tencent data
  • GMM modeling is slightly better when using DNN alignments
  • performance is worse when using the re-trained GMMs

Noise training

  • Single noise injection
  • Multi noise injection
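
As a rough illustration of the two settings, the sketch below mixes a noise signal into speech at a target SNR, then picks one noise type at random per utterance for the multi-noise case. This is an assumption about how injection could be done, not a description of the actual recipe; `mix_at_snr`, `inject_noise`, and `noise_bank` are hypothetical names.

```python
import math
import random


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech, scaled so the speech-to-noise power ratio is snr_db."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    speech_power = sum(s * s for s in speech) / len(speech)
    noise_power = sum(n * n for n in noise) / len(noise)
    if speech_power == 0 or noise_power == 0:
        raise ValueError("speech and noise must both have non-zero power")
    # Solve snr_db = 10 * log10(speech_power / (gain**2 * noise_power)) for gain.
    gain = math.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))
    return [s + gain * n for s, n in zip(speech, noise)]


def inject_noise(speech, noise_bank, snr_db, rng=random):
    """Multi-noise case (assumed): choose one noise type at random per utterance."""
    name = rng.choice(sorted(noise_bank))
    return name, mix_at_snr(speech, noise_bank[name], snr_db)
```

Single noise injection corresponds to a `noise_bank` with one entry; multi noise injection uses several entries so that different utterances see different noise types.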

AMR compression re-training

  • 1700h AMR training ongoing


  • gfbank is better than gfcc
  • gfbank is better than fbank
  • gfbank + fbank seems to outperform the others

Word to Vector

  • Data preparation
  • Prepared 7 categories, 500+ articles in total
  • Prepared the Sogou 9-class text, 9*2000 articles in total
  • Obtained the Fudan 11-class text data, for testing only
  • Improving word vectors with multiple senses
  • Almost impossible with the toolkit
  • Could pre-train the vectors and then do clustering
  • Word-vector-based keyword extraction
  • Decide to use the Sogou data to do extraction
  • Evaluate the keyword in the classification task
  • Word-vector-based classification
  • Decide to use the Sogou data to do extraction
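
The notes do not say how keywords are scored, so the following is only one plausible baseline, stated as an assumption: rank each word of a document by the cosine similarity between its vector and the document centroid. `extract_keywords` and the `vectors` dictionary are hypothetical names; real vectors would come from the trained word-vector model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extract_keywords(words, vectors, top_k=3):
    """Rank the distinct in-vocabulary words of a document by similarity to its centroid."""
    known = [w for w in words if w in vectors]
    if not known:
        return []
    dim = len(vectors[known[0]])
    # The centroid is the mean vector over all in-vocabulary tokens (repeats count).
    centroid = [sum(vectors[w][d] for w in known) / len(known) for d in range(dim)]
    scored = {w: cosine(vectors[w], centroid) for w in set(known)}
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scored, key=lambda w: (-scored[w], w))[:top_k]
```

The extracted keywords can then be evaluated as planned, by feeding them to the text classifier in place of the full document.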

LM development


  • Character-based NNLM (6700 chars, 7-gram): training on 500M data done.
  • Boundary-involved char NNLM training done
  • Testing ongoing
  • Investigate MS RNN LM training

Pronunciation scoring

  • G-score done on 16k English model
  • The distribution of frames over phone/frame posterior scores seems highly discriminative
  • The distribution of the distance between the test utterance and the reference utterance seems to be a highly discriminative score


FST-based matching

  • Code done. Simple test done
  • Ready for large scale test

Speech QA

  • Class LM QA
  • Found that giving a smaller weight to the class FST yields better performance
  • It is currently very difficult to retrieve words that cannot be found by the original FST
  • Test negative weights