ACL RD-TEC 1.0 Summarization of C00-2116

Paper Title:
Automatic Corpus-Based Thai Word Extraction with the C4.5 Learning Algorithm

Authors: Virach Sornlertlamvanich, Tanapong Potipiti, and Thatsanee Charoenporn

Other assigned terms:

  • alphabet
  • approach
  • characters
  • compounds
  • corpora
  • dictionaries
  • dictionary
  • entropy
  • error rate
  • experimental results
  • explicit word boundary
  • extraction problem
  • extraction process
  • heuristic
  • human judgement
  • information gain
  • japanese language
  • knowledge
  • language processing tasks
  • large corpora
  • leaf
  • lexical entries
  • lexical knowledge
  • linear time
  • measure
  • measures
  • method
  • mutual information
  • n-gram
  • precision
  • probability
  • procedure
  • process
  • processing tasks
  • sentence
  • sentence boundary
  • statistics
  • substring
  • subtree
  • tags
  • test corpus
  • test set
  • thai language
  • thai word
  • training
  • training data
  • training example
  • training set
  • tree
  • word
  • word boundary
  • word frequency
  • word strings
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
