ACL RD-TEC 1.0 Summarization of W98-1210
Paper Title:
FINDING STRUCTURE VIA COMPRESSION
Authors: Jason L. Hutchens and Michael D. Alder
Primarily assigned technology terms:
Other assigned terms:
- alphabet
- approach
- characters
- chunk
- chunks
- corpora
- distribution
- english language
- english text
- entropy
- fact
- language corpora
- language corpus
- language model
- language models
- lexeme
- markov models
- measure
- natural language
- natural language corpora
- natural language text
- perplexity
- probability
- probability distribution
- process
- punctuation
- sentence
- statistical language model
- statistical model
- stress
- symbol
- symbols
- technique
- testing corpus
- text
- training
- training corpus
- word
- word boundaries
- words