From cslt Wiki
Revision as of 01:58, 18 October 2013 by Litc
ASR Kernel development
- Full-lab training is ready. Trained the first full-lab system with 16k/pseudo-48k data.
- Re-recorded 48k data using F00 (500 sentences) and retrained the model. The signal quality sounds better, but the pitch sounds a bit strange. The parameter settings need more investigation.
- Check the signal parameters and solve the problem of pitch.
- Prepare the large data training with all-F 863 data.
- Prepare the large data training with online novel.
- The search system was migrated to the customs domain, with a significant performance reduction compared with the agriculture domain:
Customs:
n   TF      TFIDF
1   0.496   0.485
2   0.619   0.615
3   0.676   0.673
4   0.713   0.715
5   0.740   0.738

Agriculture:
n   TF      TFIDF
1   0.75    0.8
2   0.85    0.883
3   0.867   0.917
4   0.867   0.95
5   0.95    0.967
- Two problems:
- Lack of semantic clusters.
- Limited training data for IDF.
- Next week
- Analyse the QA database to extract useful domain-dependent data.
- Analyse the data to expand the key words & phrases.
- Analyse the data to obtain better IDF estimates.
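
The IDF item above can be illustrated with a minimal sketch of how IDF weights might be estimated from a document collection and used to rank documents against a query. This page does not describe how the search system actually computes TF or IDF, so everything here is an assumption: the function names (compute_idf, tfidf_vector, cosine, rank) are hypothetical, the smoothed formula log((1 + N) / (1 + df)) + 1 is one common choice rather than the system's own, and the toy documents are made up for illustration.

```python
import math
from collections import Counter


def compute_idf(docs):
    """Estimate smoothed IDF from a list of tokenized documents.

    Assumed formula: idf(t) = log((1 + N) / (1 + df(t))) + 1,
    where N is the number of documents and df(t) the number of
    documents containing term t.
    """
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term once per document
    return {t: math.log((1 + n_docs) / (1 + c)) + 1.0 for t, c in df.items()}


def tfidf_vector(tokens, idf, default_idf):
    """Weight raw term counts by IDF; unseen terms get default_idf."""
    tf = Counter(tokens)
    return {t: c * idf.get(t, default_idf) for t, c in tf.items()}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, docs, idf):
    """Return document indices sorted by decreasing TF-IDF cosine score."""
    default_idf = math.log(1 + len(docs)) + 1.0  # IDF of a term with df = 0
    q = tfidf_vector(query, idf, default_idf)
    scored = [(cosine(q, tfidf_vector(d, idf, default_idf)), i)
              for i, d in enumerate(docs)]
    return [i for _, i in sorted(scored, key=lambda x: (-x[0], x[1]))]


# Toy data, invented for illustration only.
toy_docs = [
    ["customs", "declaration", "form"],
    ["customs", "tariff", "rate"],
    ["wheat", "planting", "season"],
]
toy_idf = compute_idf(toy_docs)
toy_ranking = rank(["tariff", "rate"], toy_docs, toy_idf)
```

With only a few documents, the document frequencies are coarse and most terms end up with nearly the same IDF, which is one way limited training data can leave TFIDF no better than plain TF.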