Modern society shows clear mutual influence among languages, e.g., Mandarin on minority languages in China, and English on other languages around the world. This leads to a mixlingual phenomenon, in which words of a foreign (or target, embedded) language are embedded in a host (or source, matrix) language. This phenomenon poses a serious problem for automatic speech recognition (ASR).
Building on the success of the first MixASR-CHEN 2016 challenge, the MixASR-CHEN 2017 challenge follows the same theme of mixlingual ASR. This year's task is more challenging in several ways:
- There are more utterances that involve multiple English words in the test data
- There are more English words that are not in the CMU dictionary
- There are more English phrases that involve multiple English words
- The database information is here.
- The challenge plan is here.
- The tools that can be used are here.
- The Kaldi baseline will soon be available here.
- Registration and result submission here.
- Participants from both academia and industry are welcome.
- As with the first challenge, we will base the challenge on the OCOCOSDA forum and will release the results at OCOCOSDA 2017. Challenge participants are strongly encouraged to submit a paper describing their system to the conference.
- May 15: training/dev dataset release
- July 15: OC16-CE80 test data release
- July 18: OC16-CE80 result submission
- July 29: Paper submission deadline
- OC2017: challenge result release
- Dong Wang (Tsinghua University)
- Zhiyuan Tang (Tsinghua University)
- Qing Chen (Speech Ocean), email@example.com