ACL RD-TEC 1.0 Summarization of J04-2003
Paper Title:
STATISTICAL MACHINE TRANSLATION WITH SCARCE RESOURCES USING MORPHO-SYNTACTIC INFORMATION
Primarily assigned technology terms:
- algorithm
- alignment algorithm
- alignment process
- analyzer
- automatic alignment
- automatic alignment algorithm
- automatic translation
- classification
- complex reasoning
- computational linguistics
- corpus-based machine translation
- decomposition
- disambiguation
- document classification
- feature selection
- identification
- information retrieval
- iterative training
- language modeling
- large-vocabulary speech recognition
- learning
- lexical analysis
- linear interpolation
- machine translation
- machine translation system
- machine translation systems
- matching
- maximum-entropy
- modeling
- morpho-syntactic analysis
- morphological analyzers
- morphology
- multilingual information retrieval
- preprocessing
- processing
- question inversion
- reading
- reasoning
- recognition
- recognition systems
- recognizer
- sequence translation
- smoothing
- speech recognition
- speech recognition systems
- speech recognizer
- spelling
- statistical machine translation
- statistical machine translation system
- statistical translation
- subcategorization
- syntactic analysis
- syntactic disambiguation
- training procedure
- translation algorithm
- translation modeling
- translation process
- translation system
- translation systems
- word alignment
- word matching
Other assigned terms:
- adjective
- adverb
- affixes
- alignment template
- ambiguity
- ambiguous words
- annotation
- appointment scheduling
- approach
- association for computational linguistics
- bigram
- bilingual corpora
- bilingual corpus
- bilingual dictionary
- bilingual training corpus
- bleu
- brevity penalty
- case
- case information
- class hierarchy
- co-occurrences
- compound words
- compounding
- concept
- context information
- corpora
- corpus size
- data sparseness
- data sparseness problem
- dative case
- determiner
- determiners
- development set
- dialogues
- dictionaries
- dictionary
- dictionary entries
- distribution
- document
- edit distance
- english corpus
- english language
- english sentence
- english translations
- entropy
- error rate
- events
- experimental results
- fact
- feature
- finite verb
- french
- generation
- geometric mean
- german corpus
- grammatical category
- heuristics
- hierarchical lexicon
- hypothesis
- ibm model
- ibm models
- inflected form
- inflected forms
- inflection
- intelligibility
- interpolation
- interpretation
- knowledge
- language corpus
- language model
- language pair
- language use
- large training
- large training corpora
- large-vocabulary speech
- lemma
- lemma-tag representation
- lexica
- lexical coverage
- lexical translation
- lexicon
- lexicon model
- linguistic
- linguistic knowledge
- linguistics
- log-linear combination
- log-linear model
- mapping
- maximum-entropy model
- meaning
- measure
- measures
- method
- model parameters
- monolingual corpora
- monolingual corpus
- morpho-syntactic information
- morphological knowledge
- multilingual information
- n-gram
- n-gram model
- n-grams
- names
- notational simplicity
- nouns
- parallel corpora
- parallel corpus
- parallel text
- part of speech
- penalty factor
- perplexity
- phrase
- precision
- preposition
- probabilistic lexicon
- probabilities
- probability
- probability model
- procedure
- process
- pronoun
- pronunciation
- proper name
- proper names
- reference translations
- relation
- semantic
- semantic information
- sentence
- sentence structure
- sentence-level restructuring
- sentences
- source language
- source language word
- sparseness problem
- statistical framework
- statistical language model
- statistics
- stems
- syntactic function
- syntactic information
- tag sequence
- tags
- target language
- target language corpus
- target language model
- target languages
- terms
- test corpus
- test set
- text
- training
- training corpora
- training corpus
- training data
- training set
- translation candidates
- translation direction
- translation hypothesis
- translation lexicon
- translation model
- translation models
- translation pairs
- translation probabilities
- translation probability
- translation quality
- translation task
- translations
- trigram
- unigram
- verb
- verb form
- verb number
- verbmobil corpus
- vocabulary
- vocabulary size
- word
- word alignments
- word error rate
- word form
- word order
- word sequence
- words