UltraPhonix
A dataset of ultrasound and audio recordings from children with speech sound disorders
Speakers
The UltraPhonix dataset contains 20 speakers (16 male, 4 female), aged 6-13 years.
For a list and additional details, see UPX Speakers.
Sessions
Session | Description |
---|---|
Suit | Suitability session to determine if child needs speech therapy |
BL | Baseline session before therapy (1-2 sessions) |
Mid | Mid-point session, halfway through therapy |
Post | Post-therapy session, immediately after therapy ended |
Maint | Maintenance session, some time after therapy ended |
Therapy | Therapy sessions |
Data Types
Core data types
Data type | Description |
---|---|
wav | speech waveform |
ult | raw ultrasound data |
param | ultrasound parameters |
txt | prompt text with date/time of utterance recording |
Additional data
Data type | Description |
---|---|
slt_labels | manual annotation from SLT, when available. See [2] for details |
speaker_labels | speaker diarization identifying therapist (SLT) and child (CHILD) speech |
word_labels | automatic word-level alignment |
phone_labels | automatic phone-level alignment |
Labels are available in Praat's TextGrid format and HTK's lab format.
Speaker, word, and phone labels were generated according to the methods described in [4].
File IDs
Individual recordings are indexed for each session according to their recording times. See the prompt text file for recording date/time.
Each file ID also includes a prompt type identifier. See Data for details.
References
[1] Eshky, A., Ribeiro, M. S., Cleland, J., Richmond, K., Roxburgh, Z., Scobbie, J., & Wrench, A. (2018) Ultrasuite: A repository of ultrasound and acoustic data from child speech therapy sessions. Proceedings of INTERSPEECH. Hyderabad, India.
[2] Cleland, J., Scobbie, J. M., Heyde, C., Roxburgh, Z., & Wrench, A. A. (2017). Covert contrast and covert errors in persistent velar fronting. Clinical linguistics & phonetics, 31(1), 35-55.
[3] Cleland, J., Scobbie, J. M., Roxburgh, Z., Heyde, C., & Wrench, A. A. (Under Revision). Enabling New Articulatory Gestures in Children with Persistent Speech Sound Disorders using Ultrasound Visual Biofeedback. Journal of Speech, Language, and Hearing Research.
[4] Ribeiro, M. S., Eshky, A., Richmond, K., Renals, S., (2019). Ultrasound tongue imaging for diarization and alignment of child speech therapy sessions. Proceedings of INTERSPEECH. Graz, Austria.