ACL RD-TEC 1.0 Summarization of E06-2001

Paper Title:
LARGE LINGUISTICALLY-PROCESSED WEB CORPORA FOR MULTIPLE LANGUAGES

Authors: Marco Baroni and Adam Kilgarriff

Other assigned terms:

  • annotated corpus
  • annotation
  • apa corpus
  • association measure
  • bias
  • bigram
  • bigram language model
  • case
  • content words
  • corpora
  • corpus design
  • corpus size
  • data sparseness
  • dictionary
  • dictionary definitions
  • disk
  • distribution
  • document
  • function word
  • function words
  • genre
  • german corpus
  • graph structure
  • keyword
  • language model
  • large corpora
  • lexicography
  • linguistic
  • linguistic corpora
  • linguistic data
  • linguists
  • log-likelihood
  • log-likelihood ratio
  • log-likelihood ratio association
  • machine-generated text
  • markup
  • measure
  • method
  • methodology
  • n-grams
  • named entities
  • navigational information
  • nouns
  • part-of-speech
  • part-of-speech tag
  • part-of-speech tags
  • particles
  • parts-of-speech
  • precision
  • procedure
  • processing time
  • queries
  • query
  • regular expressions
  • relation
  • seed
  • sentence
  • sentences
  • server
  • statistics
  • suffixes
  • tags
  • target language
  • temporal expressions
  • terms
  • text
  • tokens
  • topics
  • user
  • vocabulary
  • web corpus
  • web documents
  • web page
  • web pages
  • word
  • word types
  • words

This page last edited on 10 May 2017.
