ACL RD-TEC 1.0 Summarization of P06-2065
Paper Title:
UNSUPERVISED ANALYSIS FOR DECIPHERMENT PROBLEMS
Authors: Kevin Knight, Anish Nair, Nishit Rathod, and Kenji Yamada
Primarily assigned technology terms:
- algorithm
- bootstrapping
- character conversion
- character-to-byte substitution
- cluster analysis
- computational linguistics
- cryptography
- databases
- decoder
- decoding
- em learning
- em/viterbi
- encoding
- expectation-maximization
- human language
- learning
- learning techniques
- letter-substitution
- machine translation
- model bootstrapping
- modeling
- parameter tying
- recognition
- search
- smoothing
- smoothing techniques
- speech recognition
- spelling
- supervised training
- transducers
- tuning
- unsupervised approach
- unsupervised learning
- unsupervised letter-substitution
- viterbi
- viterbi algorithm
- viterbi alignment
- viterbi decoder
- viterbi search
Other assigned terms:
- alphabet
- alphabetic writing
- alphabetic writing system
- analogy
- annotation
- approach
- association for computational linguistics
- bigram
- bigram model
- break
- case
- character bigram model
- character code
- characters
- chunk
- chunks
- cluster
- clusters
- concept
- corpora
- data set
- dictionary
- distribution
- edit distance
- empirical results
- encyclopedia
- english text
- english vocabulary
- error rate
- fact
- feature
- generation
- grammar
- hindi
- hindi decipherment
- implementation
- international phonetic alphabet
- interpolation
- knowledge
- labeling
- lambda
- language models
- linguistics
- linguists
- mappings
- maps
- meaning
- measure
- message
- method
- n-gram
- n-gram model
- n-grams
- names
- natural language
- non-parallel corpora
- parallel text
- parameter settings
- parameter values
- passage
- phoneme
- phoneme sequence
- phonemes
- phonetic alphabet
- probabilistic model
- probabilities
- probability
- procedure
- process
- sentence
- slot
- source language
- spoken language
- syllables
- symbols
- task performance
- technical solution
- technique
- test corpus
- test data
- test set
- text
- theory
- tokens
- toolkit
- training
- trigram
- trigram model
- understanding
- unsmoothed phonetic bigram model
- vocabulary
- vowel
- web page
- word
- words
- writing system