ACL RD-TEC 1.0 Summarization of C04-1017
Paper Title:
SPLITTING INPUT SENTENCE FOR MACHINE TRANSLATION USING LANGUAGE MODEL WITH SENTENCE SIMILARITY
Authors: Takao Doi and Eiichiro Sumita
Primarily assigned technology terms:
- algorithm
- clustering
- corpus-based machine translation
- dialogue translation
- dialogue translation technology
- dp-match driven transducer
- english-to-japanese translation
- example-based machine translation
- hierarchical phrase alignment-based translator
- hpat
- instantiation
- machine translation
- measuring
- mt system
- mt systems
- multilingual translation
- parsing
- ranking
- retrieval procedure
- retrieving
- search
- search algorithm
- sentence simplification
- sentence splitting
- speech dialogue translation
- speech translation
- splitting
- splitting method
- transducer
- translation technology
- translator
Other assigned terms:
- bleu
- bleu score
- case
- content words
- conversation
- conversation corpus
- corpora
- dialogues
- english translations
- error rate
- evaluations
- experimental results
- head word
- japanese sentences
- knowledge
- language model
- large corpus
- linguistic
- linguistic resources
- measure
- measures
- method
- multi-reference word
- n-gram
- n-gram language model
- n-grams
- nist
- parallel corpora
- parallel corpus
- part of speech
- perplexity
- phrase
- precision
- probabilities
- probability
- procedure
- process
- reference translations
- semantic
- semantic distance
- sentence
- sentence similarity
- sentences
- similarity definition
- similarity measure
- source text
- spoken language
- statistics
- technique
- technology
- term
- test set
- text
- thesaurus
- training
- training corpus
- transcriptions
- translation knowledge
- translation quality
- translations
- travel expression corpus
- trees
- trigram
- trigram language model
- trigram model
- word
- word count
- word error rate
- word sequences
- word trigram
- word trigram model
- words