ACL RD-TEC 1.0 Summarization of W06-3103
Paper Title:
MORPHO-SYNTACTIC ARABIC PREPROCESSING FOR ARABIC TO ENGLISH STATISTICAL MACHINE TRANSLATION
Authors: Anas El Isbihani and Shahram Khadivi and Oliver Bender and Hermann Ney
Primarily assigned technology terms:
- arabic word segmentation
- automaton
- automaton-based approach
- finite state
- finite state automata
- finite state automaton
- ifsa segmentation
- learning
- learning approach
- learning method
- lemmatization
- machine translation
- machine translation system
- morphology
- normalization
- one-to-one mapping
- optimization
- parallel training
- phrase translation
- phrase-based machine translation
- phrase-based translation
- phrase-based translation system
- pos tagging
- preprocessing
- processing
- reasoning
- search
- search process
- segmentation
- segmentation and pos tagging
- segmentation method
- segmenter
- sl segmentation
- smt system
- splitting
- state automaton
- statistical machine translation
- supervised learning
- supervised learning approach
- supervised segmentation
- tagging
- tokenizer
- translation system
- transliteration
- unsupervised learning
- unsupervised learning method
- word segmentation
- word segmentation and pos tagging
Other assigned terms:
- abbreviation
- adjective
- ambiguity
- ambiguous word
- ambiguous words
- approach
- arabic language
- arabic text
- arabic treebank
- automata
- bleu
- btec corpus
- case
- characters
- community
- compound words
- corpora
- decision rule
- derivation
- determiner
- development set
- error rate
- evaluation metrics
- evaluation set
- experimental results
- fact
- heuristic
- inflection
- language model
- large corpora
- lexicon
- lexicon model
- log-linear combination
- mapping
- meaning
- measure
- method
- n-gram
- n-gram language model
- negation
- nist
- normalization factor
- opinions
- parallel training corpus
- phrase
- phrase translation model
- posterior
- posterior probability
- prefixes and suffixes
- probability
- process
- processing time
- pronouns
- pronunciation
- reference translations
- sentence
- sentences
- source language
- source language sentence
- source sentence
- statistics
- stem
- stems
- suffix
- suffixes
- target language
- target language sentence
- technique
- test corpus
- test set
- text
- training
- training corpus
- translation model
- translation models
- translation quality
- translations
- travel expression corpus
- treebank
- verb
- vocabulary
- word
- word error rate
- words