Work on this project at Freie Universität Berlin, German Grammar Group / German and Dutch Philology, is supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG) grant SCHA1916/1-1.
Publications as of January 2019:
- Schäfer & Bildhauer (in prep.) on web characterisation and corpus comparison
- Bildhauer & Schäfer (in prep.) on COReX and its usability in corpus studies
- Schäfer (2018) on corpora and cognitive representativity
- Schäfer & Pankratz (2018) on corpora and cognitive representativity
- Bildhauer & Schäfer (2017) on topic annotation
- Schäfer & Bildhauer (2016) on topic annotation
- Schäfer (2016b) on the ClaraX crawler
- Schäfer (2016a) on boilerplate detection
Software/data releases as of January 2019:
- DECOW16B corpus
- RanDECOW17 corpus
- COReX18 databases
- COWTek18 with COReX software
- ClaraX random walker with texrex
Principal investigator: Roland Schäfer
Funding amount: 286,100€
Funding period: January 2015 – June 2018 (interrupted April–September 2016)
Student assistants:
- Kim Maser, Humboldt-Universität Berlin (2015–2017)
- Luise Rißmann, Freie Universität Berlin (2015–2018)
Officially collaborating institutions: