ACL RD-TEC 1.0 Summarization of C04-1102
Paper Title:
DETECTING TRANSLITERATED ORTHOGRAPHIC VARIANTS VIA TWO SIMILARITY METRICS
Authors: Kiyonori Ohtake and Youichi Sekiguchi and Kazuhide Yamamoto
Primarily assigned technology terms:
- analyzer
- approximation
- dependency analyzer
- detection method
- distance function
- japanese romanization
- katakana transliteration
- language processing
- machine transliteration
- matching
- morphology
- natural language processing
- phonetic spelling
- processing
- regular expression
- romanization
- searching
- spelling
- terminology
- transliteration
- vector space model
- weighting
Other assigned terms:
- approach
- back-transliteration
- characters
- chinese characters
- context vector
- context vectors
- contextual information
- contextual similarity
- corpora
- data sparseness
- data sparseness problem
- dependency structure
- dictionaries
- dictionary
- edit distance
- experimental results
- f-measure
- foreign word
- foreign words
- japanese corpus
- kanji
- katakana
- language pairs
- language resources
- large corpus
- length frequency
- measures
- method
- morphemes
- multilingual corpus
- names
- natural language
- nouns
- open test
- orthography
- parameter settings
- particle
- perplexity
- phonemes
- precision
- procedure
- process
- pronunciation
- proper noun
- sentence
- sentences
- similarity metrics
- size of the corpus
- slang
- sparseness problem
- string similarity
- structure of a sentence
- target language
- technical terminology
- technology
- test set
- text
- theory
- travel expression corpus
- vector space
- verb
- vocabulary
- vowel
- weighted edit distance
- word
- words