Learner Corpora
Brazilian-Portuguese sub-corpus of ICLE (Br-ICLE)
Developed by:
Tony Berber Sardinha (Catholic University of São Paulo) and Stella O. Tagnin (University of São Paulo)
Size:
40,834 tokens (as of April 2002). Target size: 200,000. Each text sample is between 500 and 1,000 words (the maximum one student can contribute).
Contents:
Essay writing by learners of English
Access:
Access to the corpus is restricted to authorised users.
Notes:
The website offers wordlists and concordances for the 100 most frequent words of the corpus.
CLC - Cambridge Learner Corpus
Developed by:
Cambridge University Press
Size:
~20 million words
Contents:
Scripts from ~50,000 students from over 100 different first languages and 150 different countries
Access:
Access is currently restricted to members of Cambridge University Press
Notes:
The CLC is part of the CIC. Part of the CLC has been coded with a Learner Error Coding system.
International Corpus of Learner English (ICLE)
Developed by:
Sylviane Granger (Louvain Centre for English Corpus Linguistics)
Size:
2 million words
Contents:
"[written materials] by learners of English from 19 different mother tongue backgrounds"
Access:
Available on the ICLE CD-ROM
Notes:
The ICLE has a number of subcorpora, some of which are listed on this page.
Longman Learners' Corpus
Developed by:
Longman - Longman Corpus Network
Size:
10 million words
Contents:
Written language (student essays)
Access:
Restricted to Longman
Notes:
Teachers can send in student essays; more information is available on the Longman website.
Polish sub-corpus of ICLE (PICLE)
Developed by:
Przemyslaw Kaszubski
Size:
330,000 words as of March 2002
Contents:
Written essays by learners of English
Access:
Online search available
Notes:
Also available on the ICLE CD-ROM; PICLE is available in plain text and in a tagged version.