ACL RD-TEC 1.0 Summarization of W97-0120

Paper Title:
A SELF-ORGANIZING JAPANESE WORD SEGMENTER USING HEURISTIC WORD IDENTIFICATION AND RE-ESTIMATION

Other assigned terms:

  • alphabet
  • ambiguity
  • approach
  • array
  • bigram
  • break
  • character sequence
  • character type
  • characters
  • chinese characters
  • chinese word
  • community
  • corpora
  • data structure
  • dictionaries
  • dictionary
  • distribution
  • estimation
  • evaluation measures
  • f-measure
  • fact
  • foreign words
  • function words
  • grammatical function
  • heuristic
  • heuristic rule
  • heuristics
  • human intervention
  • hypotheses
  • hypothesis
  • input string
  • japanese corpus
  • japanese sentences
  • japanese text
  • japanese word
  • joint probability
  • kanji
  • katakana
  • language data
  • language model
  • large training
  • lexical rules
  • lexicon
  • linguistics
  • manual segmentation
  • meaning
  • measures
  • method
  • n-gram
  • names
  • nlp application
  • out-of-vocabulary rate
  • part of speech
  • partial parses
  • particles
  • personal names
  • phrase
  • plural noun
  • poisson distribution
  • precision
  • probabilities
  • probability
  • procedure
  • process
  • pronunciation
  • punctuation
  • relation
  • roman alphabet
  • seed
  • segmentation accuracy
  • segmented corpus
  • semantic
  • sentence
  • sentences
  • speech tag
  • statistical language model
  • substring
  • suffix
  • term
  • terms
  • text
  • tokens
  • training
  • training corpus
  • training set
  • training text
  • unigram
  • unigram model
  • unknown word model
  • word
  • word boundaries
  • word boundary
  • word formation
  • word frequencies
  • word frequency
  • word lists
  • word model
  • word segmentation accuracy
  • word sequence
  • word types
  • word-based language model
  • word-based statistical language model
  • words
  • writing system

Extracted Section Types:


This page last edited on 10 May 2017.
