Description of Research Project A1.A2

Learner corpora are ‘[…] systematic computerized collections of texts produced by language learners’ (Nesselhauf, 2004: 125). These collections stem from real language, whether spontaneous or directed. Once they are digitally analysed, the results can be applied to curriculum design, materials creation, dictionary development or learner support. The best-known example of a learner corpus is the International Corpus of Learner English (ICLE), compiled at the University of Louvain.

Spanish learner corpora are still rare. The University of Santiago de Compostela, in conjunction with the Instituto Cervantes, is developing the Corpus de Aprendices de Español como lengua extranjera (CAES). The Spanish Learner Language Oral Corpora (SPLLOC) is being developed jointly by three universities in the UK (Newcastle, Southampton and York). Ainciburu (2010) offers detailed descriptions of several projects that are building Spanish learner corpora, namely the Woslac research group's Proyecto CEDEL2 at Universidad Autónoma de Madrid and the Spanish Learner Corpus and Exercises (SLCE) at the University of Texas. In Spain, the University of Alcalá is developing a written Corpus para el Análisis de errores de aprendices de E/LE (CORANE), which covers learners with a variety of native languages. In Brazil, the USP Multilingual Learner Corpus (MLC) also works with Spanish. All in all, these projects have shown that learner corpora facilitate the study of written and oral learner language in context, accounting for learners' communicative competence and for sociolinguistic variables. They can be replicated and tested from a linguistic and a localised pedagogical perspective (Granger, Gilquin & Meunier, 2015).

The main goals of this project are:

  • Establishing a small-scale, high-quality database of written and spoken learner Spanish at NUI Galway at level A1-A2;
  • Making this resource available to other researchers and inspiring similar A1-A2 projects in other institutions in Ireland;
  • Enabling us to better understand the processes involved in learning Spanish in the Irish context and support curriculum design in Irish institutions.

Our research aims to answer the following questions: 

  1. What are the most frequent errors at level A1/A2 for Irish learners of Spanish?
  2. Where do these errors stem from and what type of errors are they?
  3. Do the learners' L1 or other languages that they have learned influence these errors?

These are questions that teachers encounter while assessing students' oral and written production, as they look for efficient methods of error repair. Our individual experience in small classrooms does not give us an overall picture of learner interlanguage. A larger case study that involves digital processing of learner language would therefore help us design better materials that anticipate and target anomalous learner structures. Both oral and written samples of learner language will be collected to account for the differences between these skills. The samples will be collected for research purposes with student consent. Students' contributions will remain anonymous, although each participant will be profiled via a questionnaire about their gender and the languages they speak, so that the samples can be classified along these two variables.

Learner corpus research has relied on corpus linguistics, contrastive analysis and error analysis. Analysis of learner corpora reveals areas where learners tend to underuse or overuse certain linguistic features compared with native-language users. Error tagging is thus inherent to learner corpora. It requires an error annotation manual built on a clearly defined taxonomy of errors and their corresponding tags. These taxonomies should be based on observable data and on well-defined or standardized linguistic categories, to minimize the subjectivity involved in the process (Díaz-Negrillo and Fernández-Domínguez, 2006: 85). Different systems are currently being explored for this project, e.g. the Free Text Error System by the University of Louvain. A good system is consistent, usable and flexible; that flexibility will allow us to create new tags that indicate the differences between Irish learners and other learners of Spanish. The linguistic levels covered are normally spelling, grammar and lexis, but phonetic tags will be included for our purposes.
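To make the idea of a taxonomy with tags more concrete, the following is a minimal sketch in Python, not the tagging system chosen for this project. The four linguistic levels come from the paragraph above; the tag codes (SP, GR, LX, PH), the subcategory names, the field names and the three sample annotations are all invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical tag set: the level names come from the text above,
# but the two-letter codes are invented for this illustration.
TAXONOMY = {
    "SP": "spelling",
    "GR": "grammar",
    "LX": "lexis",
    "PH": "phonetics",
}


@dataclass(frozen=True)
class ErrorAnnotation:
    sample_id: str     # anonymous identifier of the learner sample
    mode: str          # "written" or "oral"
    tag: str           # level code plus subcategory, e.g. "GR-GENDER"
    learner_form: str  # what the learner produced
    target_form: str   # the expected form

    def __post_init__(self):
        # Reject tags whose level code is not part of the taxonomy,
        # so that annotators cannot introduce undocumented categories.
        level = self.tag.split("-", 1)[0]
        if level not in TAXONOMY:
            raise ValueError(f"unknown error level in tag: {self.tag!r}")


def level_frequencies(annotations):
    """Count annotations per linguistic level, most frequent first."""
    counts = Counter(TAXONOMY[a.tag.split("-", 1)[0]] for a in annotations)
    return counts.most_common()


# Invented sample annotations, for illustration only.
annotations = [
    ErrorAnnotation("s01", "written", "GR-GENDER", "la problema", "el problema"),
    ErrorAnnotation("s01", "written", "SP-ACCENT", "esta", "está"),
    ErrorAnnotation("s02", "oral", "GR-SERESTAR", "soy cansado", "estoy cansado"),
]

print(level_frequencies(annotations))
```

Validating each tag against a closed list is one simple way to support the consistency mentioned above, while adding an entry to the list keeps the scheme flexible enough for new, Irish-specific tags.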

REFERENCES

Ainciburu, C. (2010), «Al día», Revista Nebrija de Lingüística aplicada, n.º 7, http://www.nebrija.com/revista-linguistica/index.html (accessed 21st March, 2019)

Díaz-Negrillo, A. and J. Fernández-Domínguez, (2006) ‘Error Tagging Systems for Learner Corpora’, RESLA, 19: 83-102.

Granger, Sylviane, Gaëtanelle Gilquin & Fanny Meunier. (2015), «Introduction: learner corpus research – past, present and future» in Sylviane Granger, Gaëtanelle Gilquin & Fanny Meunier (eds.), The Cambridge Handbook of Learner Corpus Research, 1-7. Cambridge: Cambridge University Press.

Nesselhauf, Nadja. 2004. «Learner corpora and their potential for language teaching» in John Sinclair (ed.), How to Use Corpora in Language Teaching, 125-152. Amsterdam: Benjamins.

Some Examples of Spanish Learner Corpora Considered for Replication

  • Alvarez López, F. (2005): Corpus de textos académicos producidos por estudiantes universitarios extranjeros (62 exam essays), LINRED, n.º 3
  • Boudali, I. (2005): Corpus de textos escritos por alumnos tunecinos de enseñanza secundaria estudiantes de E/LE (23 essays), LINRED, n.º 3
  • García, M. (2005): Corpus de Conversaciones en Español como Lengua Extranjera (9 conversations between university students in Germany), LINRED, n.º 3
  • Grupo de investigación Woslac (2007): Proyecto CEDEL2. Universidad Autónoma de Madrid. Available at http://www.uam.es/woslac/cedel2.htm
  • Gilquin, Gaëtanelle, Szilvia Papp and María Belén Díez-Bedmar (eds.) (2008): Linking up contrastive and learner corpus research. Amsterdam: Rodopi.
  • Gutiérrez Quintana, E. (2005): Corpus de textos escritos por universitarios italianos estudiantes de ELE (44 essays by Italian students in an academic context), LINRED, n.º 3
  • Koike, N. (2007): Spanish Learner Corpus and Exercises (SLCE), University of Texas. Available at http://www.laits.utexas.edu/slce
  • Lin, Tzu-Ju (2005): Corpus de textos escritos por universitarios taiwaneses estudiantes de español (185 essays by students in Taiwan), LINRED, n.º 3
  • Mitchell, R. (2008): SPLLOC: A new database for Spanish second language acquisition research. University of Southampton.
  • Penadés Martínez, I. (2005): «Corpus para el análisis de errores en el aprendizaje de E/LE: Presentación», in LINRED: Revista electrónica de lingüística, Informaciones sobre cuestiones lingüísticas, n.º 3, available at http://www.linred.es/informacion_pdf/informacion8_19092005.pdf
  • Tagnin, S. (2002): USP Multilingual Learner Corpus (MLC). Department of Modern Languages, University of São Paulo (Brazil)
  • Tracy-Ventura, N. (2008): Spanish Learner Language Oral Corpus project (SPLLOC 1). A New Corpus of Oral L2 Spanish. Currently at http://www.splloc.soton.ac.uk/ and on TalkBank (http://www.talkbank.org/).

Published by eleineirinn

Primer Simposio Internacional sobre Enseñanza del Español como Lengua Extranjera
