Category Archives: Talks

What is a good corpus? (Anke Lüdeling, Roland Schäfer, Elizabeth Pankratz, Thomas Krause, Felix Bildhauer, Felix Golcher)

Anke Lüdeling, Roland Schäfer, Elizabeth Pankratz, Thomas Krause, Felix Bildhauer, Felix Golcher. 2021. What is a good corpus? A series of three talks given in the colloquium of the CRC (SFB) 1412 Register at Humboldt University.

Part 1: What is a good corpus? Corpus Design

  1. Sampling
  2. Annotation

Part 2: Inference

  1. Philosophies of inference
  2. The Texas Marksman
  3. Weak or no error probing with corpora: Some solutions
  4. It’s never “just another interface”! (INF)

Part 3: Corpus creation and corpus use

  1. Specialised corpora
  2. Web corpora

Corpora, Inference, and Models of Register Distributions (DGfS 2021, AG15)

Felix Bildhauer & Elizabeth Pankratz & Roland Schäfer (alphabetically). Corpora, Inference, and Models of Register Distributions. Accepted presentation at Contrastive corpus methodology for language modeling and analysis (AG 15) at the Annual Meeting of the German Linguistic Society, DGfS 2021. Freiburg, Cyberspace.

Bildhauer, Pankratz, Schäfer. DGfS Annual Meeting 2021. Corpora, Inference, and Models of Register Distributions (Handout, licensed under CC-BY-SA)

Beyond Multidimensional Analysis: probabilistic register induction for large corpora (DGfS-CL 2020)

Felix Bildhauer & Roland Schäfer. Beyond Multidimensional Analysis: probabilistic register induction for large corpora. Poster to be presented at the DGfS-CL session at DGfS 2020 in Hamburg, 3–6 March 2020. Download abstract (PDF). Click on the image or this link to download the A0 poster.

COReX and COReCO: A lexico-grammatical document annotation framework for large German corpora (DGfS-CL 2017)

Felix Bildhauer & Roland Schäfer. COReX and COReCO: A lexico-grammatical document annotation framework for large German corpora. Poster to be presented at the DGfS-CL session at DGfS 2017 in Saarbrücken, 7–9 March 2017.