Project

A Crowdsourcing Platform for Spoken CALL Content

English title A Crowdsourcing Platform for Spoken CALL Content
Applicant Rayner Emmanuel
Number 177065
Funding scheme COST (European Cooperation in Science and Technology)
Research institution Faculté de traduction et d'interprétation Traitement Informatique Multilingue Université de Genève
Institution of higher education University of Geneva - GE
Main discipline Applied linguistics
Start/End 01.04.2018 - 30.11.2021
Approved amount 298'683.00

Keywords (4)

spoken language applications; Computer-Assisted Language Learning; crowdsourcing; web

Lay Summary (translated from French)

Lead
The aim of this project is to develop a crowdsourcing platform for creating educational content. The software platform used to create and host this content will be an extended version of CALL-SLT Lite (http://callslt.unige.ch/demos-and-resources/), developed in an earlier SNSF-funded project. Since CALL-SLT Lite was designed from the outset with crowdsourcing in mind, the most important extension will be an interface that makes it easier for non-expert users to build courses.
Lay summary

The main task of the project will be to set up the necessary infrastructure, in particular the incentive and reward techniques needed to motivate the community, together with metrics for tracking progress. Based on other successful experiences of this kind (Wikipedia, Amazon, OpenStreetMap, Goodreads, etc.), we believe it should be possible to reuse known techniques such as voting, mediation and gamification to provide the necessary momentum. The project will be closely integrated with the enetCollect COST network (http://enetcollect.eurac.edu/), in which UNIGE is an active partner. It will first aim to recruit content creators and users from among the members of enetCollect. The architecture of the crowdsourcing network will be designed so that other language-learning platforms, in particular those developed by other enetCollect groups, can easily be integrated.

The project timeline envisages setting up the network during the first year, then revising the architecture during the second and third years in response to user feedback.

 

Last update: 16.01.2018

Publications

Publication
LARA in the Service of Revivalistics and Documentary Linguistics: Community Engagement and Endangered Languages
Zuckermann Ghil'ad, Vigfússon Sigurður, Rayner Manny, Ní Chiaráin Neasa, Ivanova Nedelina, Habibi Hanieh, Bédi Branislav (2021), LARA in the Service of Revivalistics and Documentary Linguistics: Community Engagement and Endangered Languages, in Proceedings of the Workshop on Computational Methods for Endangered Languages, 1(2), 13-23.
Easy construction of multimedia online language textbooks and linguistics papers with LARA
Butterweck Matthias, Chua Cathy, Habibi Hanieh, Rayner Manny, Zuckermann Ghil'ad (2019), Easy construction of multimedia online language textbooks and linguistics papers with LARA, in 12th annual International Conference of Education, Research and Innovation, Seville, Spain, IATED, Seville, Spain.
LARA portal: a tool for teachers to develop interactive text content, an environment for students to improve reading skills
Habibi Hanieh (2019), LARA portal: a tool for teachers to develop interactive text content, an environment for students to improve reading skills, in 12th annual International Conference of Education, Research and Innovation, Seville, Spain, IATED, Seville, Spain.
Vegetarian vampires: why the CALL technology provider doesn't have to suck the teacher's blood
Chua Cathy, Rayner Manny (2019), Vegetarian vampires: why the CALL technology provider doesn't have to suck the teacher's blood, in 12th annual International Conference of Education, Research and Innovation, Seville, Spain, IATED, Seville, Spain.
Demonstration of LARA: A Learning and Reading Assistant
Akhlaghi Elham, Bédi Branislav, Butterweck Matthias, Chua Cathy, Gerlach Johanna, Habibi Hanieh, Ikeda Junta, Rayner Manny, Sestigiani Sabina, Zuckermann Ghil'ad (2019), Demonstration of LARA: A Learning and Reading Assistant, in Proceedings of the 8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019), Graz, Austria, ISCA, Graz, Austria.
Overview of LARA: A Learning and Reading Assistant
Akhlaghi Elham, Bédi Branislav, Butterweck Matthias, Chua Cathy, Gerlach Johanna, Habibi Hanieh, Ikeda Junta, Rayner Manny, Sestigiani Sabina (2019), Overview of LARA: A Learning and Reading Assistant, in Proceedings of the 8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019), ISCA, Graz, Austria.
Overview of the 2019 Spoken CALL Shared Task
Baur Claudia, Caines Andrew, Chua Cathy, Gerlach Johanna, Qian Mengjie, Rayner Manny, Russell Martin, Strik Helmer, Wei Xizi (2019), Overview of the 2019 Spoken CALL Shared Task, in Proceedings of the 8th ISCA Workshop on Speech and Language Technology in Education (SLaTE 2019), Graz, Austria, ISCA, Graz, Austria.
Using LARA for language learning: a pilot study for Icelandic
Bédi Branislav, Chua Cathy, Habibi Hanieh, Martinez-Lopez Ruth, Rayner Manny (2019), Using LARA for language learning: a pilot study for Icelandic, in CALL and complexity – short papers from EUROCALL 2019, Louvain la Neuve, Belgium, Research-publishing.net, Louvain la Neuve, Belgium.
Alexa as a CALL platform for children: Where do we start?
Tsourakis Nikos, Rayner Manny, Habibi Hanieh, Gallais Pierre-Emmanuel, Chua Cathy, Butterweck Matt (2019), Alexa as a CALL platform for children: Where do we start?, in Proceedings of the enetCollect WG3/WG5 workshop, Leiden, Holland, CEUR-WS, Internet.
Decentralising power: how we are trying to keep CALLector ethical
Chua Cathy, Habibi Hanieh, Rayner Manny, Tsourakis Nikos (2019), Decentralising power: how we are trying to keep CALLector ethical, in Proceedings of the enetCollect WG3/WG5 workshop, CEUR-WS, Internet.
What do the founders of online communities owe to their users?
Chua Cathy, Rayner Manny (2019), What do the founders of online communities owe to their users?, in Proceedings of the enetCollect WG3/WG5 workshop, Leiden, Holland, CEUR-WS, Internet.
A Robust Context-Dependent Speech-to-Speech Phraselator Toolkit for Alexa
Rayner Manny, Tsourakis Nikos, Stanek Jan (2018), A Robust Context-Dependent Speech-to-Speech Phraselator Toolkit for Alexa, in Proceedings of Interspeech 2018, Hyderabad, India, IEEE, Hyderabad, India.
Overview of the 2018 Spoken CALL Shared Task
Baur Claudia, Caines Andrew, Chua Cathy, Gerlach Johanna, Qian Mengjie, Rayner Manny, Russell Martin, Strik Helmer, Wei Xizi (2018), Overview of the 2018 Spoken CALL Shared Task, in Proceedings of Interspeech 2018, Hyderabad, India, IEEE, Hyderabad, India.
Assessing the Quality of TTS Audio in the LARA Learning-by-Reading Platform
Akhlaghi Elham, Bączkowska Anna, Berthelsen Harald, Bédi Branislav, Chua Cathy, Cucchiarini Catia, Habibi Hanieh, Horváthová Ivana, Hvalsøe Pernille, Lotz Roy, Maizonniaux Christèle, Ní Chiaráin Neasa, Rayner Manny, Tsourakis Nikos, Yao Chunlin, Assessing the Quality of TTS Audio in the LARA Learning-by-Reading Platform, in Short papers from EUROCALL 2021, Paris, France, research-publishing.net, Paris, France.
Constructing LARA Content
Rayner Manny, Butterweck Matt, Habibi Hanieh, Chua Cathy, Constructing LARA Content, University of Geneva, Geneva.
Constructing Multimodal Language Learner Texts Using LARA: Experiences with Nine Languages
Akhlaghi Elham, Bédi Branislav, Bektaş Fatih, Berthelsen Harald, Butterweck Matthias, Chua Cathy, Cucchiarini Catia, Eryiğit Gülşen, Gerlach Johanna, Habibi Hanieh, Ní Chiaráin Neasa, Rayner Manny, Steingrímsson Steinþór, Strik Helmer, Constructing Multimodal Language Learner Texts Using LARA: Experiences with Nine Languages, in Proc. LREC 2020, Marseille, France, ELRA, Marseille, France.
Learning Old Norse through digital technologies: using LARA to build an online Völuspá
Bédi Branislav, Bernharðsson Haraldur, Chua Cathy, Rayner Manny, Learning Old Norse through digital technologies: using LARA to build an online Völuspá, in Proc IASS 2020, IASS, Vilnius, Lithuania.
Teaching Old Norse in LARA
Bédi Branislav, Bernharðsson Haraldur, Chua Cathy, Rayner Manny, Teaching Old Norse in LARA, in Proceedings of EuroCALL 2020, Copenhagen, Denmark, EuroCALL, Copenhagen, Denmark.
Using LARA for Second Language Teaching in Iran: A pilot study in Farsi and English
Akbari Omid, Akhlaghi Elham, Chua Cathy, Habibi Hanieh, Martinez-Lopez Ruth, Rayner Manny, Using LARA for Second Language Teaching in Iran: A pilot study in Farsi and English, in Proceedings of EuroCALL 2020, Research-publishing.net, Copenhagen, Denmark.

Datasets

Spoken CALL Shared Task, second edition

Author Baur, Claudia; Caines, Andrew; Chua, Cathy; Gerlach, Johanna; Rayner, Manny
Publication date 07.02.2018
Persistent Identifier (PID) Spoken CALL Shared Task, second edition
Repository Geneva University website
Abstract
The Spoken CALL Shared Task is an initiative to create an open challenge dataset for speech-enabled CALL systems, jointly organised by the University of Geneva, the University of Birmingham, Radboud University and the University of Cambridge. The task is based on data collected from a speech-enabled online tool which has been used to help young Swiss German teens practise skills in English conversation. Items are prompt-response pairs, where the prompt is a piece of German text and the response is a recorded English audio file. The task is to label pairs as “accept” or “reject”, accepting responses which are grammatically and linguistically correct, so as to match a set of hidden gold standard answers as closely as possible. Resources are provided so that a scratch system can be constructed with a minimal investment of effort, and in particular without necessarily using a speech recogniser.

The first edition of the task was announced at LREC 2016, with training data released in July 2016 and test data in March 2017, and attracted 20 entries from 9 groups. Results, including seven papers, were presented at the SLaTE workshop in August 2017. Full details, including links to resources, results and papers, can be found on the Shared Task home page.

Following the success of the original task, we are organising a second edition. We will approximately double the amount of training data, provide new test data, and release improved versions of the accompanying resources. In particular, we will make generally available the open source Kaldi recogniser developed by the University of Birmingham, which achieved the best performance on the original task, together with versions of the training and test data pre-processed through this recogniser. Results will be presented in a special session at Interspeech 2018.
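
As an illustration only, the accept/reject labelling described in this abstract can be sketched in a few lines of Python. This is not the organisers' baseline or any of the distributed resources. The names `Item`, `normalise`, `label` and `agreement` are hypothetical, the audio response is replaced by an assumed text transcript, the example prompt and answers are invented, and `agreement` is plain accuracy rather than the official Shared Task metric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One prompt-response pair; `transcript` stands in for the recorded audio."""
    prompt: str       # German text shown to the student
    transcript: str   # assumed text transcription of the English response


def normalise(text: str) -> str:
    """Lower-case, drop punctuation and collapse whitespace."""
    kept = "".join(ch for ch in text.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def label(item: Item, references: dict[str, set[str]]) -> str:
    """Return "accept" if the response matches a listed answer for its prompt."""
    allowed = {normalise(r) for r in references.get(item.prompt, set())}
    return "accept" if normalise(item.transcript) in allowed else "reject"


def agreement(predicted: list[str], gold: list[str]) -> float:
    """Fraction of items whose label equals the gold label (plain accuracy)."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Invented example data, for illustration only.
PROMPT = "Frag: Zimmer für 3 Nächte"
references = {PROMPT: {"I would like a room for three nights"}}
items = [
    Item(PROMPT, "I would like a room for three nights."),
    Item(PROMPT, "I want room three night"),
]
labels = [label(item, references) for item in items]
```

With this invented data, `labels` is `["accept", "reject"]`. Exact matching against a list of reference answers is deliberately naive: it shows the shape of the task (pairs in, binary labels out, comparison against gold labels) rather than a competitive system.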

Spoken CALL Shared Task, third edition

Author Baur, Claudia; Caines, Andrew; Chua, Cathy; Gerlach, Johanna; Rayner, Manny
Persistent Identifier (PID) Spoken CALL Shared Task, third edition
Repository Spoken CALL Shared Task, third edition
Abstract
The Spoken CALL Shared Task is an initiative to create an open challenge dataset for speech-enabled CALL systems, jointly organised by the University of Geneva, the University of Birmingham, Radboud University and the University of Cambridge. The task is based on data collected from a speech-enabled online tool which has been used to help young Swiss German teens practise skills in English conversation. Items are prompt-response pairs, where the prompt is a piece of German text and the response is a recorded English audio file. The task is to label pairs as “accept” or “reject”, accepting responses which are grammatically and linguistically correct, so as to match a set of hidden gold standard answers as closely as possible. Resources are provided so that a scratch system can be constructed with a minimal investment of effort, and in particular without necessarily using a speech recogniser.

The first edition of the Shared Task was carried out in 2017, with results presented at the SLaTE 2017 workshop in Stockholm. The second edition, with improved training data and improved baseline recogniser resources, was carried out in 2018, results this time being presented at a special session of Interspeech 2018 in Hyderabad. Details, including full results and links to all the papers, are available from the Shared Task 1 site and the Shared Task 2 site.

The third edition of the Shared Task will make available the same training data and resources as the second edition. There will be new test data. Given the strong results reported in the second edition, we are also making an important change: THE THIRD EDITION WILL USE A NEW METRIC. This new metric, Dfull, is defined in the instructions tab and motivated in §5 of the Interspeech 2018 overview paper. Unlike the metric used in the two previous editions of the Shared Task, which focused on optimizing performance for correct student responses (i.e. responses which should be accepted), Dfull places equal weight on correct and incorrect utterances. Since incorrect responses are considerably harder to process than correct ones, we expect Dfull to pose interesting new challenges.

Full data for "Assessing the Quality of TTS Audio in the LARA Learning-by-Reading Platform"

Author Akhlaghi, Elham; Bączkowska, Anna; Berthelsen, Harald; Bédi, Branislav; Chua, Cathy; Cucchiarini, Catia; Habibi, Hanieh; Horváthová, Ivana; Hvalsøe, Pernille; Lotz, Roy; Maizonniaux, Christèle; Ní Chiaráin, Neasa; Rayner, Manny; Tsourakis, Nikos; Yao, Chunlin
Publication date 20.08.2021
Persistent Identifier (PID) not yet available
Repository CALLector project website
Abstract
A popular idea in CALL is to use multimodal annotated texts, with annotations typically including embedded audio and translations, to support L2 learning through reading. An important question is how to create the audio, which can be done either through human recording or by a TTS engine. We may reasonably expect TTS to be quicker and easier, but human recording to be of higher quality. Here, we report a study using the open source LARA platform and ten languages. Samples of LARA audio totalling about 3.5 minutes were provided for each language in both human and TTS form; subjects used a web form to compare different versions of the same item and rate the voices as a whole. Although human voice was more often preferred, TTS achieved higher ratings in some languages and was close in others. Links to the relevant LARA texts, the data collection form and the full results are available from the repository.

Collaboration

Group / person Country
Types of collaboration
Christèle Maizonniaux, Flinders University Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
Anna Bączkowska, University of Gdansk Poland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
Elham Akhlaghi, Ferdowsi University Of Mashhad Iran (Asia)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Ghil'ad Zuckermann, University of Adelaide Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Pierre-Emmanuel Gallais, Independent scholar France (Europe)
- Publication
- Industry/business/other use-inspired collaboration
Sabina Sestigiani, Swinburne University Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Rina Zviel-Girshin, Ruppin Academic Center Israel (Asia)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Branislav Bédi, The Árni Magnússon Institute for Icelandic Studies Iceland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Steinþór Steingrímsson, The Árni Magnússon Institute for Icelandic Studies Iceland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Chadi Raheb, Independent scholar Iran (Asia)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Marta Mykhats, Independent scholar Ukraine (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Katerina Zourou, Web2Learn Greece (Europe)
- in-depth/constructive exchanges on approaches, methods or results
Ivana Horváthová, Univerzita Konstantina Filozofa Slovakia (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
Hakeem Beedar, University of Adelaide Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Research Infrastructure
Roy Lotz, Independent scholar Spain (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Martin Russell, University of Birmingham Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Annabel Keigwin, Independent scholar Switzerland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Industry/business/other use-inspired collaboration
Helmer Strik Netherlands (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Ruth Martinez-Lopez, Samara University Russia (Europe)
- Industry/business/other use-inspired collaboration
Cathy Chua, Adelaide University Australia (Oceania)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Yao Chunlin, Tianjin Chengjian University China (Asia)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
Andrew Caines, University of Cambridge Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Matthias Butterweck, Independent scholar Germany (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Pernille Hvalsøe, University of Copenhagen Denmark (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Xizi Wei, University of Birmingham Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Mengjie Qian, University of Birmingham Great Britain and Northern Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Junta Ikeda, Independent scholar Australia (Oceania)
- Industry/business/other use-inspired collaboration
Neasa Ní Chiaráin, Trinity College, Dublin Ireland (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure
Gülşen Eryiğit, Istanbul Technical University, Istanbul Turkey (Europe)
- in-depth/constructive exchanges on approaches, methods or results
- Publication
- Research Infrastructure

Scientific events

Active participation

Title | Type of contribution | Title of article or contribution | Date | Place | Persons involved
ICERI 2019 | Poster | LARA Portal: A Tool for Teachers to Develop Interactive Text Content, an Environment for Students To Improve Reading Skill | 11.11.2019 | Seville, Spain | Habibi Hanieh
SLaTE 2019 | Talk given at a conference | Overview of LARA: A Learning and Reading Assistant | 20.09.2019 | Graz, Austria | Rayner Emmanuel
EuroCALL 2019 | Talk given at a conference | Using LARA for learning Icelandic | 28.08.2019 | Louvain-la-Neuve, Belgium | Rayner Emmanuel
enetCollect 3rd Annual Meeting | Talk given at a conference | LARA – the Learning and Reading Assistant | 14.03.2019 | Lisbon, Portugal | Bouillon Pierrette; Tsourakis Nikos
enetCollect WG3/WG5 workshop | Talk given at a conference | Decentralising power: how we are trying to keep CALLector ethical | 24.10.2018 | Leiden, Netherlands | Rayner Emmanuel; Tsourakis Nikos
Interspeech 2018 | Talk given at a conference | Overview of the 2018 Spoken CALL Shared Task | 02.09.2018 | Hyderabad, India | Rayner Emmanuel


Associated projects

Number | Title | Start | Funding scheme
204503 | Extending the LARA Learning-by-Reading Platform | 01.02.2022 | Project funding

Abstract

We propose a three-year project, whose central goal is to develop a crowdsourcing network capable of creating large quantities of content for speech-enabled Computer-Assisted Language Learning (CALL). The software platform used to create and host the content would be an extended version of the CALL-SLT Lite platform, developed under an SNSF-sponsored project which has just ended. CALL-SLT Lite has been designed with crowdsourcing in mind as a possible long-term goal; the modifications required are quite small, mainly consisting of a better "Wizard"-style web interface.

The main thrust of the project is concerned with setting up the required crowdsourcing infrastructure, in particular developing the system of incentives and rewards needed to motivate the content-producers, together with a set of metrics that can be used to track progress. Extrapolating from the histories of successful networks of this kind (Wikipedia, Amazon, Goodreads, etc.), we guess that a well-designed framework using likes/votes, mediation of comments and gamification will be enough to establish the necessary momentum. The project would be closely integrated with the enetCollect COST network, in which our group plays a leading role; it would initially aim to recruit both content-producers and content-consumers from enetCollect. It is precisely the existence of this highly multinational consortium, which already unites hundreds of people interested in crowdsourcing of CALL content, that makes us optimistic that our goals can be attained. The architecture of the crowdsourcing network will be designed so that other CALL platforms (in particular, platforms developed by other groups in enetCollect) can easily be integrated.

The timeline of the project envisages setting up the basic crowdsourcing network during the first year, recruiting producers and consumers from enetCollect, then revising the architecture during the second year in response to user feedback. Work during the last third of the project will depend on how things go during the first two years. If, as we hope, the network grows well, we will naturally aim to keep it moving in a good direction; this will almost certainly involve increased collaboration with other enetCollect groups. If it does not, we will replan appropriately.

A successful outcome to this project could have a substantial impact on the development of CALL technology for the whole of Europe.