ACL RD-TEC 1.0 Summarization of W04-3229
Paper Title:
A RESOURCE-LIGHT APPROACH TO RUSSIAN MORPHOLOGY: TAGGING RUSSIAN USING CZECH RESOURCES
Authors: Jiri Hana and Anna Feldman and Chris Brew
Primarily assigned technology terms:
- algorithm
- ambiguity reduction
- analyzer
- collapsing
- comparative analysis
- database
- encoding
- information retrieval
- language tools
- lexical lookup
- machine translation
- markov model
- modelling
- morphological analysis
- morphological analyzer
- morphological analyzers
- morphological processing
- morphology
- nlp
- parsing
- part-of-speech tagging
- pre-processing
- processing
- reading
- tagger
- taggers
- tagging
- tnt tagger
- viterbi
- viterbi algorithm
- voting
- xerox tagger
Other assigned terms:
- affixation
- alphabet
- ambiguity
- annotated corpora
- annotated corpus
- approach
- case
- constituent order
- corpora
- czech corpus
- data sparsity
- dependency treebank
- derivational morphology
- discourse
- distribution
- error rate
- evaluations
- experimental results
- fact
- free word order
- grammatical functions
- heuristic
- homonymy
- hypotheses
- implementation
- knowledge
- language resources
- lemma
- lemmata
- lexical information
- lexicon
- linguistic
- linguistic intuition
- main verb
- markov models
- method
- morpheme
- morphological category
- morphological lexicon
- multext-east project
- n-gram
- n-gram models
- negation
- nlp tasks
- nouns
- part-of-speech
- parts-of-speech
- penn treebank
- penn treebank tagset
- prague dependency treebank
- precision
- probabilities
- process
- punctuation
- reflexivization
- relation
- roman alphabet
- run-time
- russian
- segments
- sentence
- sentences
- slot
- stem
- stems
- symbols
- syntactic information
- tag set
- tagging task
- tags
- tagset
- test corpus
- test data
- testing corpus
- testing data
- text
- textbook
- tokens
- training
- training corpus
- training data
- transcriptions
- transformation
- transition probabilities
- treebank
- unannotated corpus
- uniform distribution
- verb
- word
- word form
- word order
- words