German Corpora

COSMAS 2: Corpus Storage, Maintenance and Access System
Hosted at:   IDS, Mannheim, Germany
Size:   not specified
Contents:   COSMAS hosts a wide variety of corpora.
Access:   public web access
Notes:   Version 3.4.2 of COSMAS-II-Clients is now available (Jan 2005)

DWDS - Das Digitale Wörterbuch des 20. Jahrhunderts
Developed by: DWDS
Size:   several corpora
Contents:
    "Kerncorpus" (core corpus): written texts (1900-2000)
    Extended corpus (internal access only): 1 billion words (mostly newspaper text)
    ZEIT Corpus (monitor corpus): all issues from 1996 to the present
    DDR Corpus: 9 million words; written texts (1949-1990)
    Jüdische Periodika (Jewish periodicals): 26 million words; 8 journals
    ...
Access:   Online Search available free of charge
Notes:   Search engine Dialing/DWDS-Concordancer available for download (GNU Licence)

LIMAS - Linguistik und Maschinelle Sprachbearbeitung
Developed by: Forschungsgruppe LIMAS (Bonn/Regensburg)
Size:   approx. 1 million words
Contents:   various texts from 33 different areas (1970 and 1971)
Access:   Free online access
Notes:    

Negr@: A Syntactically Annotated Corpus of German Newspaper Texts
Developed by: Saarland University, Germany
Size:   176,000 tokens (10,000 sentences) 
Contents:   German newspaper text (1990s)
Access:   free access for scientific use
Notes:   POS tagged and syntactically annotated

 

TIGER Corpus
Developed by: Department of Computational Linguistics and Phonetics in Saarbrücken; Institute of Natural Language Processing (IMS) in Stuttgart; Institut für Germanistik in Potsdam
Size:   700,000 tokens (40,000 sentences)
Contents:   Newspaper text (Frankfurter Rundschau)
Access:   Free access for research purposes; online search possible; software: TIGER Search
Notes:   The TIGER Corpus is a treebank and part of the TIGER Project.