Difference between revisions of "ASR:2015-12-1"

Revision as of 05:10, 7 December 2015

Speech Processing

AM development

Environment

RNN AM

  • train monophone RNN --zhiyuan


Adaptive learning rate method

  • sequence training -Xiangyu

Mic-Array

  • hold
  • compute EER with Kaldi
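Kaldi ships a compute-eer tool for this; purely as an illustrative sketch of the metric itself (not Kaldi's implementation), the EER can be found by sweeping a decision threshold over the pooled scores until the false-accept and false-reject rates cross:

```python
def compute_eer(target_scores, nontarget_scores):
    """Equal Error Rate: sweep a threshold over all observed scores and
    return the point where false-accept rate ~= false-reject rate."""
    best_gap, eer = 1.0, 1.0
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer
```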

Data selection unsupervised learning

  • hold
  • acoustic-feature-based submodular data selection on the Pingan dataset --zhiyong
  • write code to speed it up --zhiyong
  • curriculum learning --zhiyong
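Facility location is a common submodular objective for acoustic data selection. Below is a minimal greedy-selection sketch; the function names and the use of cosine similarity are illustrative assumptions, not the actual Pingan recipe:

```python
import math

def cosine(x, y):
    """Cosine similarity between two acoustic feature vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0

def greedy_select(feats, budget):
    """Greedy maximization of the facility-location objective
    F(S) = sum_i max_{j in S} sim(i, j).  F is monotone submodular,
    so the greedy pick is within (1 - 1/e) of the optimum."""
    selected = []
    covered = [0.0] * len(feats)          # best similarity achieved so far
    for _ in range(budget):
        best_gain, best_j = -1.0, None
        for j in range(len(feats)):
            if j in selected:
                continue
            gain = sum(max(cosine(feats[i], feats[j]) - covered[i], 0.0)
                       for i in range(len(feats)))
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
        covered = [max(covered[i], cosine(feats[i], feats[best_j]))
                   for i in range(len(feats))]
    return selected
```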

RNN-DAE (RNN-based Deep Auto-Encoder)

  • hold
  • RNN-DAE performs worse than DNN-DAE because the training dataset is small
  • extract real room impulse responses to generate reverberated WSJ data, then train the RNN-DAE
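Generating reverberated data from a measured room impulse response amounts to convolving the clean waveform with the RIR. A minimal sketch (direct O(N·M) convolution; a real pipeline would use FFT-based convolution for speed):

```python
def add_reverb(clean, rir):
    """Simulate reverberation by full linear convolution of the clean
    waveform with a (real, measured) room impulse response."""
    out = [0.0] * (len(clean) + len(rir) - 1)
    for i, s in enumerate(clean):
        for j, h in enumerate(rir):
            out[i + j] += s * h
    return out
```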

Speaker recognition

  • learning from ivector --Lantian
  • DNN-ivector framework
      • CNN ivector learning
  • SUSR
      • DNN ivector learning
  • AutoEncoder + metric learning
  • binary ivector
  • metric learning
  • LDA-vector Transfer Learning
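The binary i-vector idea can be illustrated by sign-binarizing each dimension and scoring with Hamming similarity instead of cosine distance. This is a hypothetical sketch, not the group's actual scheme:

```python
def binarize(ivector):
    """Map each i-vector dimension to a bit by its sign."""
    return [1 if v > 0 else 0 for v in ivector]

def hamming_similarity(a, b):
    """Fraction of matching bits; a cheap stand-in for cosine scoring."""
    return sum(x == y for x, y in zip(a, b)) / len(a)
```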


language vector

  • write a paper--zhiyuan
  • hold
  • language vector is added to multiple hidden layers --zhiyuan
  • RNN language vector
  • hold
  • train with speech-rate info as an extra input
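Feeding a language (or speech-rate) vector into multiple hidden layers can be sketched by concatenating the vector onto the input of every layer. The toy forward pass below is an illustrative assumption, not the actual recipe:

```python
import math

def layer_forward(x, weights, bias):
    """One affine layer followed by a sigmoid nonlinearity."""
    z = [sum(w * xi for w, xi in zip(row, x)) + b
         for row, b in zip(weights, bias)]
    return [1.0 / (1.0 + math.exp(-v)) for v in z]

def forward_with_lang_vector(x, lang_vec, layers):
    """Concatenate the language vector onto the input of every hidden
    layer, so each layer sees the language identity directly."""
    h = x
    for weights, bias in layers:
        h = layer_forward(h + lang_vec, weights, bias)
    return h
```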

multi-GPU

  • multi-stream training --Sheng Su
  • write a technique report
  • kaldi-nnet3 --Xuewei
  • train a 7×2048 TDNN using 4000h of data --Mengyuan
  • train MPE using WSJ and Aurora4 --Zhiyong, Xuewei
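Multi-stream training is typically synchronous data parallelism: each stream computes gradients on its own shard, the gradients are averaged, and one update is applied. A minimal sketch with plain lists standing in for GPU tensors (names are illustrative):

```python
def average_gradients(worker_grads):
    """Average per-worker gradients elementwise (synchronous data
    parallelism: each GPU/stream trains on its own shard)."""
    n = len(worker_grads)
    return [sum(g[i] for g in worker_grads) / n
            for i in range(len(worker_grads[0]))]

def sgd_step(params, worker_grads, lr=0.1):
    """One synchronous update using the averaged gradient."""
    avg = average_gradients(worker_grads)
    return [p - lr * g for p, g in zip(params, avg)]
```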

multi-task

  • test self-information-based neural structure learning --mengyuan
  • hold
  • write code done
  • no significant performance improvement observed
  • speech rate learning --xiangyu
  • test using extreme data

Text Processing

Work

RNN Poem Process

  • Combine additional rhymes.
  • Investigate new methods.

Document Representation

  • Code done; waiting for experiment results.

Seq to Seq

  • Work on some tasks.

Order representation

  • Coding up some ideas.

Balance Representation

  • Investigate some papers.
  • Current solution: use knowledge bases or similar pairs mined from a large corpus.
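Mining similar pairs from a large corpus usually means nearest-neighbour search under cosine similarity over word or document embeddings. A small illustrative sketch (brute force; names and vectors are hypothetical):

```python
import math

def cosine(x, y):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    return dot / (nx * ny) if nx and ny else 0.0

def most_similar_pair(words, vectors):
    """Brute-force search for the most similar pair; a real corpus
    would need approximate nearest-neighbour search instead."""
    best, best_sim = None, -1.0
    for i in range(len(words)):
        for j in range(i + 1, len(words)):
            sim = cosine(vectors[i], vectors[j])
            if sim > best_sim:
                best, best_sim = (words[i], words[j]), sim
    return best
```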

Hold

Neural Based Document Classification

RNN Rank Task

Graph RNN

  • Embed entity paths into entity representations.
  • (hold)

RNN Word Segment

  • Set boundaries for word segmentation.
  • (hold)

Recommendation

  • Reproduce baseline.
  • LDA matrix decomposition.
  • LDA (Text classification & Recommendation System) --> AAAI

RNN based QA

  • Read Source Code.
  • Attention based QA.
  • Coding.
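Attention-based QA scores each memory or passage slot against the question and answers from a softmax-weighted sum of the values. A minimal dot-product attention sketch (dimensions and names are illustrative assumptions):

```python
import math

def attention(query, keys, values):
    """Dot-product attention: score each memory slot against the query,
    softmax the scores, and return the weighted sum of value vectors."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
```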

Text Group Intern Project

Buddhist Process

  • (hold)

RNN Poem Process

  • Done by Haichao Yu & Chaoyuan Zuo; mentor: Tianyi Luo.

RNN Document Vector

  • (hold)

Image Baseline

  • Demo Release.
  • Paper Report.
  • Read CNN Paper.

Text Intuitive Idea

Trace Learning

  • (Hold)

Match RNN

  • (Hold)

financial group

model research

  • RNN
  • online model, updated every day
  • modify the cost function and learning method
  • add more features

rule combination

  • GA method to optimize the model
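A GA over rule combinations can be sketched as evolving binary masks of which rules are active, with truncation selection, one-point crossover, and point mutation. Everything below (population size, rates, the fitness interface) is an illustrative assumption, not the group's actual setup:

```python
import random

def genetic_optimize(fitness, n_genes, pop_size=20, generations=50, seed=0):
    """Minimal genetic algorithm over binary rule masks: keep the top
    half each generation, breed children by one-point crossover, and
    occasionally flip one gene."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]           # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_genes)      # one-point crossover
            child = a[:cut] + b[cut:]
            if rng.random() < 0.1:               # point mutation
                g = rng.randrange(n_genes)
                child[g] ^= 1
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```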

basic rule

  • classical tenth model

multiple-factor

  • add more factors
  • use a sparse model
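A standard way to get a sparse multi-factor model is L1 regularization, e.g. proximal gradient descent (ISTA), where soft-thresholding zeroes out weak factors. A minimal sketch under squared loss (data and hyperparameters are illustrative):

```python
def soft_threshold(v, t):
    """Proximal operator of the L1 norm: shrink toward zero and clip,
    which is what zeroes out weak factors."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def lasso_ista(X, y, lam=0.1, lr=0.01, steps=2000):
    """Sparse linear model via proximal gradient descent (ISTA) on
    mean squared error + lam * L1 penalty."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    for _ in range(steps):
        preds = [sum(wj * xj for wj, xj in zip(w, row)) for row in X]
        grad = [sum((preds[i] - y[i]) * X[i][j] for i in range(n)) / n
                for j in range(d)]
        w = [soft_threshold(wj - lr * g, lr * lam) for wj, g in zip(w, grad)]
    return w
```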

display

  • bug fixed
  • buy rule fixed

data

  • data api
  • download futures data and factor data