Felix Bildhauer & Roland Schäfer (2017). Induktive Topikmodellierung und extrinsische Topikdomänen (final draft). In: Marek Konopka & Angelika Wöllstein (eds.). Grammatische Variation – Empirische Zugänge und theoretische Modellierung. Berlin/Boston: Mouton De Gruyter. 331–344.
Category Archives: Papers
Automatic Classification by Topic Domain for Meta Data Generation, Web Corpus Evaluation, and Corpus Comparison (Proc WAC)
On Bias-free Crawling and Representative Web Corpora (Proc WAC)
Accurate and Efficient General-Purpose Boilerplate Detection for Crawled Web Corpora (LREV)
CommonCOW: Massively Huge Web Corpora from CommonCrawl Data and a Method to Distribute them Freely under Restrictive EU Copyright Laws (Proc LREC)
Processing and Querying Large Web Corpora with the COW14 Architecture (Proc CMLC)
Roland Schäfer. Processing and Querying Large Web Corpora with the COW14 Architecture. In: Proceedings of Challenges in the Management of Large Corpora (CMLC-3) (IDS publication server). 28–34.
Die Kurzformen des Indefinitartikels im Deutschen (ZS)
Roland Schäfer & Ulrike Sayatz (2014). Die Kurzformen des Indefinitartikels im Deutschen (Cliticization of the indefinite article in German). Zeitschrift für Sprachwissenschaft (ZS) 33(2).