ACL RD-TEC 1.0 Summarization of W06-1701

Paper Title:
WEB-BASED FREQUENCY DICTIONARIES FOR MEDIUM DENSITY LANGUAGES

Authors: András Kornai, Péter Halácsy, Viktor Nagy, Csaba Oravecz, Viktor Trón, and Dániel Varga

Other assigned terms:

  • ambiguity
  • annotation
  • authorship
  • bias
  • bigram
  • brown corpus
  • cache
  • capitalization information
  • case
  • characters
  • coefficient
  • coherence
  • community
  • compounds
  • corpora
  • corpus size
  • derivational morphology
  • dictionaries
  • dictionary
  • disambiguation system
  • disambiguation task
  • distribution
  • entropy
  • fact
  • frequency counts
  • frequency distribution
  • frequency list
  • generative models
  • genre
  • grammaticality
  • hypothesis
  • inflectional morphology
  • labeling
  • language models
  • language usage
  • lemma
  • lexicon
  • linguistic
  • linguistic data
  • linguistic information
  • linguistics
  • long distance dependencies
  • manual tagging
  • markov models
  • meanings
  • measure
  • measures
  • method
  • monolingual corpora
  • morpheme
  • morphemes
  • morphological annotation
  • morphological lexicon
  • n-gram
  • nlp tasks
  • noun phrases
  • nouns
  • opennlp package
  • parallel corpus
  • pos information
  • precision
  • probabilistic model
  • probabilities
  • probability
  • process
  • punctuation
  • quantifier
  • queries
  • query
  • reuters corpus
  • sentence
  • sentences
  • statistics
  • stem
  • stems
  • style
  • suffix
  • surface form
  • syntax
  • tag model
  • tagging model
  • technology
  • test corpus
  • test set
  • text
  • theoretical linguistics
  • theory
  • tokens
  • topics
  • training
  • training corpora
  • training corpus
  • training material
  • trigram
  • unigram
  • vocabulary
  • vowel
  • web corpus
  • word
  • word form
  • word frequency
  • word usage
  • wordform
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
