ACL RD-TEC 1.0 Summarization of P06-1085
Paper Title:
CONTEXTUAL DEPENDENCIES IN UNSUPERVISED WORD SEGMENTATION
Authors: Sharon Goldwater, Thomas L. Griffiths, and Mark Johnson
Primarily assigned technology terms:
- algorithm
- approximate search
- bayesian approach
- bayesian word segmentation
- computational linguistics
- database
- factorization
- gibbs sampler
- gibbs sampling
- incremental search
- iterative procedure
- lexical acquisition
- maximum likelihood
- modeling
- online search
- processing
- sampling
- search
- search algorithm
- search technique
- segmentation
- stochastic process
- suboptimal search
- transcription
- two-stage modeling
- unsupervised word segmentation
- word segmentation
Other assigned terms:
- approach
- association for computational linguistics
- backoff
- bigram
- bigram language model
- bigram model
- boundary marker
- cache
- case
- characters
- childes database
- collocation
- conditional probability
- convergence
- corpora
- correlation
- dictionary
- dirichlet distribution
- distribution
- f-score
- fact
- generative model
- grammar
- hypotheses
- hypothesis
- hypothesis space
- language model
- lexical entries
- lexical entry
- lexical items
- lexicon
- likelihood
- linguistics
- markov chain
- measures
- method
- multinomial distribution
- n-gram
- n-gram model
- n-gram models
- natural language
- parameter settings
- phoneme
- phonemes
- phonemic representation
- phonemic transcription
- posterior
- posterior distribution
- posterior probability
- precision
- prior probability
- priori
- probabilistic models
- probabilities
- probability
- probability distribution
- procedure
- process
- process model
- search procedure
- segmentation accuracy
- statistics
- stems
- technique
- term
- terms
- text
- token frequency
- tokens
- trigram
- uniform distribution
- unigram
- unigram language model
- unigram model
- utterance
- word
- word boundaries
- word boundary
- word frequencies
- word type
- word types
- words