Publications
* indicates equal contribution.
[EMNLP 2025 main] Speech Vecalign: An Embedding-Based Method for Aligning Parallel Speech Documents. Chutong Meng, Philipp Koehn. [code]
[Interspeech 2025] The ML-SUPERB 2.0 Challenge: Towards Inclusive ASR Benchmarking for All Language Varieties. William Chen, Chutong Meng, Jiatong Shi, Martijn Bartelds, Shih-Heng Wang, Hsiu-Hsuan Wang, Rafael Mosquera, Sara Hincapie, Dan Jurafsky, Antonis Anastasopoulos, Hung-yi Lee, Karen Livescu, Shinji Watanabe.
[IWSLT 2025] GMU Systems for the IWSLT 2025 Low-Resource Speech Translation Shared Task. Chutong Meng, Antonios Anastasopoulos. [code]
[ACL 2024 main] RepCodec: A Speech Representation Codec for Speech Tokenization. Zhichao Huang*, Chutong Meng*, Tom Ko. [code]
[TASLP 2024] WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research. Xinhao Mei, Chutong Meng, Haohe Liu, Qiuqiang Kong, Tom Ko, Chengqi Zhao, Mark D. Plumbley, Yuexian Zou, Wenwu Wang. [dataset]
[Interspeech 2023] CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning. Chutong Meng*, Junyi Ao*, Tom Ko, Mingxuan Wang, Haizhou Li. [code]
[Interspeech 2023] GigaST: A 10,000-Hour Pseudo Speech Translation Corpus. Rong Ye, Chengqi Zhao, Tom Ko, Chutong Meng, Tao Wang, Mingxuan Wang, Jun Cao.