ACL RD-TEC 1.0 Summarization of W04-0404
Paper Title:
TRANSLATION BY MACHINE OF COMPLEX NOMINALS: GETTING IT RIGHT
Authors: Timothy Baldwin and Takaaki Tanaka
Primarily assigned technology terms:
- bilingual bootstrapping
- binary classification
- bootstrapping
- candidate generation
- candidate selection
- candidate selection method
- classification
- classifier
- compositional method
- corpus-based translation
- cross-validation
- direct translation
- japanese machine translation
- kernel
- learner
- likelihood estimate
- linear interpolation
- machine translation
- maximum likelihood
- modelling
- monolingual selection
- morphology
- mt system
- mt systems
- parser
- processing
- random selection
- ranking
- rating
- selection method
- statistical mt
- support vector machine
- translation selection
- translator
- translators
- word alignment
Other assigned terms:
- annotation
- annotator
- benchmark
- bias
- bilingual dictionaries
- bilingual dictionary
- british national corpus
- bunsetsu
- case
- chunks
- compositionality
- compound nominal
- compounds
- conditional probability
- contextual similarity
- corpora
- corpus evidence
- corpus frequency
- data set
- derivational morphology
- dictionaries
- dictionary
- dictionary data
- distribution
- edict dictionary
- english translations
- english web
- entropy
- f-score
- fact
- feature
- feature types
- feature value
- feature vector
- feature vectors
- generation
- hypothesis
- implementation
- interpolation
- knowledge
- language corpus
- language expression
- language model
- lexical specification
- likelihood
- maximum likelihood estimate
- method
- methodology
- monolingual corpus
- morph
- multiword expressions
- mwes
- nn compound
- nominals
- noun phrases
- nouns
- parallelism
- paraphrase
- paraphrases
- part of speech
- prepositions
- probabilities
- probability
- procedure
- punctuation
- punctuation mark
- random sample
- relation
- relative frequency
- reuters corpus
- sentence
- sentence boundary
- shimbun corpus
- slot
- source language
- statistics
- subcategorisation
- support vector
- svms
- system performance
- tag sequence
- target language
- target language corpus
- target language model
- terms
- test data
- token frequency
- training
- training data
- translation accuracy
- translation candidate
- translation candidates
- translation model
- translation pairs
- translation probabilities
- translation quality
- translation task
- translation template
- translations
- web pages
- word
- words