ACL RD-TEC 1.0 Summarization of W05-0606
Paper Title:
COMPUTING WORD SIMILARITY AND IDENTIFYING COGNATES WITH PAIR HIDDEN MARKOV MODELS
Authors: Wesley Mackay and Grzegorz Kondrak
Primarily assigned technology terms:
- algorithm
- baum-welch algorithm
- binary classification
- biological sequence analysis
- classification
- cognate identification
- computational linguistics
- computational natural language learning
- computing
- correction method
- cross-lingual information retrieval
- cutoff
- finite-state transducers
- forward-backward algorithm
- hidden markov
- hidden markov model
- hidden markov models
- hmm-based algorithm
- identification
- identification process
- induction
- information retrieval
- language learning
- language processing
- learning
- length correction
- levenshtein
- machine translation
- markov model
- modeling
- natural language learning
- natural language processing
- orthographic representation
- pairwise alignment
- processing
- ranking
- recognition
- scoring
- segmentation
- sentence alignment
- sequence analysis
- speech recognition
- spelling
- splitting
- statistical machine translation
- string segmentation
- supervised training
- transducer
- transducers
- viterbi
- viterbi algorithm
- word aligner
- word alignment
- word length correction
Other assigned terms:
- alignment model
- alignment task
- alphabet
- approach
- association for computational linguistics
- bias
- biological sequence
- bitext
- case
- characters
- co-occurrence
- co-occurrence statistics
- concepts
- data corpus
- data set
- dictionary
- domain-specific knowledge
- edit distance
- experimental results
- fact
- french
- indoeuropean data corpus
- knowledge
- language pair
- language pairs
- large training
- lcsr
- levenshtein distance
- lexicography
- likelihood
- linguistics
- markov models
- meaning
- meanings
- measure
- measures
- method
- n-grams
- names
- natural language
- pairs of words
- phoneme
- phonemes
- phonetic representation
- phonetic similarity
- precision
- probabilities
- probability
- process
- recognition model
- recognition precision
- recognition task
- roman alphabet
- russian
- sentence
- similarity measures
- similarity model
- similarity score
- statistical significance
- statistics
- string edit distance
- string similarity
- symbol
- symbols
- synonyms
- synonymy
- technique
- test data
- test set
- testing data
- trained model
- training
- training and testing data
- training data
- training set
- transformation
- transition probabilities
- transition probability
- translation dictionary
- word
- word lists
- word pair
- word similarity
- words