ACL RD-TEC 1.0 Summarization of C00-2095
Paper Title:
A FORMALISM FOR UNIVERSAL SEGMENTATION OF TEXT
Primarily assigned technology terms:
- algorithm
- automatic segmentation
- automaton
- computing
- disambiguation
- disambiguation method
- document representation
- encoding
- final state
- finite-state automata
- finite-state automaton
- finite-state calculus
- finite-state transducer
- identification
- morphological analysis
- morphology
- nlp
- nlp system
- nlp systems
- part of speech tagging
- part-of-speech tagging
- processing
- pruning
- segmentation
- segmenter
- sentence segmentation
- shortest path
- some segmentation
- speech tagger
- speech tagging
- tagger
- tagging
- tile
- tile processing
- tokenization
- tokenization system
- tokenizer
- transducer
- word segmentation
Other assigned terms:
- ambiguity
- annotation
- annotation graphs
- approach
- array
- automata
- bigram
- bigram model
- case
- characters
- chinese characters
- composition
- dictionaries
- dictionary
- document
- feature
- finite-state structure
- formalism
- french
- heuristic
- heuristics
- implementation
- input string
- input text
- lexicon
- linguistic
- linguistic data
- mapping
- maps
- meaning
- method
- n-grams
- nouns
- paragraphs
- part of speech
- part of speech tags
- part-of-speech
- process
- punctuation
- punctuation marks
- regular expressions
- relation
- segmentation level
- semantic
- semantic information
- sentence
- sentences
- statistics
- subgraph
- surface form
- symbols
- tags
- technique
- terms
- text
- tile graph
- tokens
- transformation
- vocabulary
- word
- word level
- words
- xml format