Mixlingual Speech Processing and Chinese-English MixASR Challenge

  • Organizers: Dong Wang (Tsinghua Univ.), Qing Chen (Speech Ocean)
  • Email:;


Modern society shows clear mutual influence among languages, e.g., the influence of Mandarin on minority languages in China, and of English on other languages around the world. This leads to a clear mixlingual phenomenon, in which words of a foreign (or target, embedded) language are embedded in a host (or source, matrix) language. This mixlingual effect causes significant problems in various speech processing tasks. This special session focuses on recent research on mixlingual speech processing from a broad range of disciplines, including but not limited to speech recognition, speech synthesis, speech analysis, and spoken language understanding. In particular, this special session calls for a mixlingual ASR challenge, for which we offer a large Chinese-English mixlingual speech database, OC16-CE80 (provided by Speechocean), comprising 80 hours of speech data and the associated resources.


This special session is expected to attract papers on recent research progress in the area of mixlingual speech processing. The targeted research topics include, but are not limited to, the following:

  •  Mixlingual phonetic and phonological analysis
  •  Mixlingual speech recognition
  •  Mixlingual speech synthesis
  •  Language turn detection
  •  Mixlingual language understanding

Chinese-English Mixlingual ASR (MixASR-CHEN) Challenge

The OC16-CE80 database contains 80 hours of Chinese-English mixlingual data, in which English words are embedded in host Chinese sentences. This special session calls for a Chinese-English MixASR challenge based on this database. The data will be free to institutes that (1) participate in the MixASR-CHEN challenge; or (2) participate in this special session and require the data to evaluate their research.