A dataset of ultrasound and audio recordings from children with speech sound disorders
The UltraPhonix dataset contains 20 speakers (16 male, 4 female), aged 6-13 years.
For a list of speakers and additional details, see UPX Speakers.
|Session|Description|
|---|---|
|Suit|Suitability session to determine whether the child needs speech therapy|
|BL|Baseline session before therapy (1-2 sessions)|
|Mid|Mid-point session, halfway through therapy|
|Post|Post-therapy session, immediately after therapy ended|
|Maint|Maintenance session, some time after therapy ended|
Core data types
|Data type|Description|
|---|---|
|ult|raw ultrasound data|
|txt|prompt text with date/time of utterance recording|
|slt_labels|manual annotation from SLT, when available. See  for details|
|speaker_labels|speaker diarization identifying therapist (SLT) and child (CHILD) speech|
|word_labels|automatic word-level alignment|
|phone_labels|automatic phone-level alignment|
Labels are available in Praat's TextGrid format and HTK's lab format.
Speaker, word, and phone labels were generated according to the methods described in .
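As an illustration of working with the HTK lab format mentioned above, here is a minimal Python sketch of a reader. It assumes the standard HTK label layout, where each line holds a start time, an end time, and a label, with times in units of 100 ns. The names `parse_htk_lab` and `Segment` are hypothetical helpers, and the example lines are made up for illustration, not taken from the dataset. Check a real label file before relying on this layout.

```python
from typing import List, NamedTuple

# Assumption: times follow the standard HTK convention of 100 ns units.
HTK_TIME_UNIT = 1e-7


class Segment(NamedTuple):
    start: float  # seconds
    end: float    # seconds
    label: str


def parse_htk_lab(text: str) -> List[Segment]:
    """Parse HTK-style label lines of the form '<start> <end> <label>'."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        start, end = int(fields[0]), int(fields[1])
        segments.append(
            Segment(start * HTK_TIME_UNIT, end * HTK_TIME_UNIT, fields[2])
        )
    return segments


# Made-up example content, not real dataset output.
example = """0 2500000 sil
2500000 7000000 CHILD
7000000 12000000 SLT
"""

segments = parse_htk_lab(example)
for seg in segments:
    print(f"{seg.start:.2f}-{seg.end:.2f} s: {seg.label}")
```

The same segment structure would apply to speaker, word, and phone labels. For the TextGrid versions, a dedicated TextGrid reader is likely a better choice than hand-written parsing.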
Within each session, individual recordings are indexed in order of recording time. See the prompt text file for the recording date and time.
Each file ID also includes a prompt type identifier. See Data for details.
Eshky, A., Ribeiro, M. S., Cleland, J., Richmond, K., Roxburgh, Z., Scobbie, J., & Wrench, A. (2018). UltraSuite: A repository of ultrasound and acoustic data from child speech therapy sessions. Proceedings of INTERSPEECH. Hyderabad, India.
Cleland, J., Scobbie, J. M., Heyde, C., Roxburgh, Z., & Wrench, A. A. (2017). Covert contrast and covert errors in persistent velar fronting. Clinical Linguistics & Phonetics, 31(1), 35-55.
Cleland, J., Scobbie, J. M., Roxburgh, Z., Heyde, C., & Wrench, A. A. (Under Revision). Enabling New Articulatory Gestures in Children with Persistent Speech Sound Disorders using Ultrasound Visual Biofeedback. Journal of Speech, Language, and Hearing Research.
Ribeiro, M. S., Eshky, A., Richmond, K., & Renals, S. (2019). Ultrasound tongue imaging for diarization and alignment of child speech therapy sessions. Proceedings of INTERSPEECH. Graz, Austria.