
Assessing the Quality of TTS Audio in the LARA Learning-by-Reading Platform

Type of publication Proceedings (peer-reviewed)
Authors Akhlaghi Elham, Bączkowska Anna, Berthelsen Harald, Bédi Branislav, Chua Cathy, Cucchiarini Catia, Habibi Hanieh, Horváthová Ivana, Hvalsøe Pernille, Lotz Roy, Maizonniaux Christèle, Ní Chiaráin Neasa, Rayner Manny, Tsourakis Nikos, Yao Chunlin
Project A Crowdsourcing Platform for Spoken CALL Content


Title of proceedings Short papers from EUROCALL 2021
Place Paris, France

Open Access


A popular idea in CALL is to use multimodal annotated texts, with annotations typically including embedded audio and translations, to support L2 learning through reading. An important question is how to create the audio, which can be done either through human recording or with a TTS engine. We may reasonably expect TTS to be quicker and easier, but human recording to be of higher quality. Here, we report a study using the open-source LARA platform and ten languages. Samples of LARA audio totalling about 3.5 minutes were provided for each language in both human and TTS form; subjects used a web form to compare different versions of the same item and rate the voices as a whole. Although the human voice was more often preferred, TTS achieved higher ratings in some languages and came close in others.