It’s official - I’ve made tenure! I’m now an Associate Professor in Psychology at Nanyang Technological University Singapore!🎉
This wouldn’t have been possible without my awesome mentors, my academic support sisters 💪💪💪, the wonderful wonderful junior collaborators in my lab - past and present - and of course, my anonymous letter writers 🙏
Baseline system + leaderboards are up for the #MerlionChallenge on untangling complex code-mixed speech. Which #ML #DeepLearning #SpeechProc system will do the best job on complex language use in the wild? 👀
TWO TEAMS have already beaten the baseline for Language ID:
🎉Lingua_Lumos (Closed)
🎉UNSW_Signal_Processing (Open)
There’s still time to join the challenge and prep your paper for our special session at #Interspeech2023
Have you ever seen auto-generated subtitles turn to mush because they couldn’t handle a speaker’s accent or figure out what language they’re speaking after a switch? The #MerlionChallenge for #Interspeech23 tests how well teams can build a language detection system for Code-Switching in >300 Zoom recordings. Help build robust systems for multilingualism by joining the challenge or sharing with #ML #DeepLearning #SpeechProc friends 💪🏼💪🏽💪🏿 https://toot.community/@suzyjstyles/109713790862725145
We provide:
👉Training data (closed track)
👉Dev audio + ground truth
👉Eval audio w/ reduced annotations
👉CodaLab submission and scoring
👉FOUR live Leaderboards
👉Leaderboard Chat 😏
You can add:
👉Up to 100hr training data (open track)
For the #MerlionChallenge at #Interspeech we’ll be asking teams to train a #SpeechProc / #AI system that can work out which language is being spoken (Task 1: Language ID) and when (Task 2: Language Diarization)!
👉Challenge audio is Zoom recordings with English and/or Mandarin Chinese
👉Audio for development matches audio for evaluation 😗👌
Our annotation protocol is documented in the BELA transcription conventions. The Wiki includes instructions for how to do multi-tier multilingual transcriptions using Elan (free!)
BELA Con:
blipntu.github.io/belacon/
For the #MerlionChallenge we hold some info back
I’m sure I have a bunch of #Multilingual #LangDev, #SpeechProc #NLP and #CogSci friends over here 🦣
We’ve prepped >30hrs of our English/Mandarin code-switched child directed speech for the #MerlionChallenge at this year’s INTERSPEECH
>300 files, >100 voices 🙀 (+ training data)
We’re looking for speech systems that can figure out which language is spoken when!
The #MerlionChallenge will see whose system does the best job 💪🏼
Join or help us boost the message: https://sites.google.com/view/merlion-ccs-challenge/
Swapped 29°C Singapore for 9°C Berlin (that’s 84°F and 48°F for you Americanists 😅). Same humidity though - 89%!!!
Blue skies and light fog. CRISP 🥶