ACL RD-TEC 1.0 Summarization of P06-1142
Paper Title:
LEARNING TRANSLITERATION LEXICONS FROM THE WEB
Authors: Jin-Shea Kuo and Haizhou Li and Ying-Kuei Yang
Primarily assigned technology terms:
- active learning
- adaptive learning
- algorithm
- approximation
- automatic construction
- automatic speech recognition
- automatic transliteration
- bootstrap
- co-occurrence analysis
- confidence scoring
- crawling
- cross-lingual information retrieval
- cutoff
- database
- databases
- direct orthography mapping
- dynamic programming
- dynamic programming algorithm
- english-chinese transliteration
- expectation-maximization
- grapheme-based syllabification
- harvesting
- incremental learning
- information retrieval
- learning
- learning algorithm
- learning approach
- learning framework
- learning process
- lexicon construction
- lexicon learning
- machine learning
- machine translation
- machine transliteration
- modeling
- nlp
- noisy channel model
- noisy-channel modeling
- phoneme-based syllabification
- phonetic mapping
- phonetic similarity modeling
- programming algorithm
- psm adaptation
- querying
- ranking
- re-training
- recognition
- rule-based mapping
- sample selection
- scoring
- search
- search engines
- selection method
- speech recognition
- spelling
- statistical machine translation
- statistical modeling
- supervised learning
- supervised learning approach
- syllabification
- translators
- transliteration
- transliteration modeling
- unsupervised learning
- validation
- validation process
- web crawling
- web search
Other assigned terms:
- acronym
- annotation
- annotation effort
- approach
- bigram
- bilingual lexicon
- bilingual lexicons
- bitext
- case
- characters
- chinese characters
- chinese word
- co-occurrence
- co-occurrence statistics
- community
- comparable bitext
- comparable corpora
- comparable corpus
- comparative study
- conditional independence
- conditional probabilities
- confidence score
- confidence scores
- confusion matrix
- confusion probability
- corpora
- correlation
- estimation
- events
- f-measure
- fact
- foreign word
- generative model
- generative models
- gold standard
- grapheme
- hypothesis
- hypothesis test
- information source
- interpretation
- knowledge
- labeling
- language model
- learning strategy
- lexicon
- likelihood
- likelihood probability
- linear combination
- mapping
- mapping rules
- meanings
- measures
- method
- n-gram
- names
- nlp tasks
- noisy channel
- nouns
- orthography
- phoneme
- phoneme sequence
- phonemes
- phonemic representation
- phonetic similarity
- phonetic similarity model
- phrase
- posterior
- posterior probability
- precision
- probabilistic model
- probabilities
- probability
- procedure
- process
- pronunciation
- proper names
- queries
- query
- seed
- semantic
- sentence
- sentences
- similarity model
- source language
- source-channel model
- statistics
- syllables
- technique
- terms
- text
- training
- training data
- training set
- transformation
- translation model
- translation pairs
- translation problem
- transliteration lexicon
- transliteration model
- web pages
- word
- word frequencies
- words