ACL RD-TEC 1.0 Summarization of W97-0122

Paper Title:
USING WORD FREQUENCY LISTS TO MEASURE CORPUS HOMOGENEITY AND SIMILARITY BETWEEN CORPORA

Other assigned terms:

  • american english
  • approach
  • break
  • british english
  • british national corpus
  • brown corpus
  • case
  • chunks
  • community
  • computer program
  • contingency table
  • conversation
  • corpora
  • corpus similarity and homogeneity
  • correlation
  • document
  • document frequency
  • email
  • entropy
  • evaluation methodology
  • events
  • fact
  • frequency list
  • genre
  • hypotheses
  • hypothesis
  • interpretation
  • inverse document frequency
  • language corpora
  • language model
  • language type
  • large corpora
  • lexicography
  • linguistic
  • linguistic features
  • linguistic structures
  • linguistic theory
  • linguistic variation
  • linguistics
  • linguists
  • lob corpus
  • meaning
  • meanings
  • measure
  • measures
  • message
  • method
  • methodology
  • mutual information
  • null hypothesis
  • paragraph
  • parsed corpus
  • parts-of-speech
  • perplexity
  • punctuation
  • rank correlation
  • relation
  • relative clauses
  • representations
  • similarity scores
  • spearman rank correlation
  • statistic
  • statistics
  • subcorpus
  • sublanguage
  • subtree
  • subtrees
  • syntactic categories
  • syntactic category
  • syntactic constructions
  • term
  • terms
  • text
  • text encoding initiative
  • text type
  • textbook
  • theory
  • transcript
  • word
  • word frequencies
  • word frequency
  • word senses
  • words


This page last edited on 10 May 2017.
