Text-Speech Alignment: A Robin Hood Approach for Endangered Languages

Poster Session

Presenter/Creator Information

Claire Bowern, Yale University
Rikker Dockum, Yale University
Sarah Babinski, Yale University
Hunter Craft, Yale University
Anelisa Fergus, Yale University
Dolly Goldenberg, Yale University

Website

http://pamanyungan.net

Description

Forced alignment automatically aligns audio recordings of spoken language with transcripts at the level of individual sounds, greatly reducing the time required to prepare data for linguistic analysis. However, existing algorithms are mostly trained on a few well-documented languages. We test the performance of three algorithms against manually aligned data from a highly endangered language. For at least some tasks, unsupervised alignment (either based on English or trained from a small corpus) is sufficiently reliable to be used on legacy data for low-resource languages. Automatic alignment with minimal training data is accurate enough for descriptive phonetic work on vowel inventories and prosody. Underutilized legacy data exist for many endangered languages, which creates both a need and an opportunity to leverage new technology.
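
Comparing an aligner against manually aligned data usually comes down to measuring how far each automatic segment boundary falls from its hand-placed counterpart. The following is a minimal sketch of one such comparison, not the procedure used in the poster: the function name `boundary_agreement`, the 20 ms tolerance, and the boundary times are all assumptions made for illustration.

```python
from statistics import mean


def boundary_agreement(manual, automatic, tolerance=0.020):
    """Return (mean absolute error, share of boundaries within tolerance).

    Both arguments are lists of boundary times in seconds for the same
    segments, in the same order. The tolerance is an assumed threshold.
    """
    if len(manual) != len(automatic):
        raise ValueError("boundary lists must have the same length")
    if not manual:
        raise ValueError("boundary lists must not be empty")
    # Absolute displacement of each automatic boundary from the manual one
    errors = [abs(m - a) for m, a in zip(manual, automatic)]
    share_within = sum(e <= tolerance for e in errors) / len(errors)
    return mean(errors), share_within


# Invented boundary times, for illustration only (not data from the poster)
manual = [0.10, 0.25, 0.40, 0.62]
automatic = [0.11, 0.24, 0.45, 0.62]
mean_error, share_within = boundary_agreement(manual, automatic)
print(f"mean error: {mean_error:.4f} s, within tolerance: {share_within:.0%}")
```

Reporting both the mean error and the share of boundaries inside a tolerance helps separate an aligner that is slightly off everywhere from one that is mostly exact but occasionally far off.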

Included in

Language Description and Documentation Commons, Phonetics and Phonology Commons

https://elischolar.library.yale.edu/dayofdata/2018/posters/13
