ACL RD-TEC 1.0 Summarization of P98-1076
Paper Title:
ONE TOKENIZATION PER SOURCE
Primarily assigned technology terms:
- algorithm
- ambiguity resolution
- bracketing
- decision making
- disambiguation
- finite-state
- language processing
- morpho-syntactic parsing
- natural language processing
- parsing
- part-of-speech tagging
- processing
- random selection
- scoring
- scoring function
- sentence tokenization
- structural analysis
- tagging
- tokenization
- tokenization disambiguation
- transducer
- unigram scoring
- validation
Other assigned terms:
- ambiguity
- case
- characters
- chinese characters
- collocation
- compounds
- corpora
- dictionary
- dictionary entries
- discourse
- english dictionary
- english sentence
- fact
- formalisms
- grammar
- hypotheses
- hypothesis
- implementation
- large corpus
- linguistic
- linguistic expression
- linguistic expressions
- linguistic phenomena
- local constraints
- morphemes
- mutual information
- natural language
- part-of-speech
- ph corpus
- phrase
- procedure
- proposition
- representation framework
- representative corpora
- semantic
- sentence
- sentences
- sentential context
- sparse data
- sparse data problem
- terms
- theorem
- theories
- theory
- tokens
- tree
- tree-like structure
- unigram
- word
- words