ACL RD-TEC 1.0 Summarization of W03-1701

Paper Title:
UNSUPERVISED TRAINING FOR OVERLAPPING AMBIGUITY RESOLUTION IN CHINESE WORD SEGMENTATION

Authors: Mu Li, Jianfeng Gao, Chang-Ning Huang, and Jianfeng Li

Other assigned terms:

  • ambiguity
  • ambiguous word
  • approach
  • binary classification problem
  • case
  • character sequence
  • characters
  • chinese characters
  • chinese text
  • chinese text corpus
  • chinese word
  • classification problem
  • classification task
  • co-occurrence
  • context feature
  • context features
  • context information
  • context window
  • context words
  • contextual information
  • data set
  • distribution
  • estimation
  • evaluations
  • experimental results
  • fact
  • feature
  • feature set
  • joint probability
  • labeled training data
  • language model
  • lexicon
  • likelihood
  • measures
  • method
  • mutual information
  • natural language
  • open test
  • oracle
  • precision
  • probabilities
  • probability
  • procedure
  • process
  • rule set
  • search space
  • sentence
  • sentences
  • statistical information
  • statistical language model
  • statistics
  • substring
  • support vector
  • test set
  • text
  • text corpus
  • tokens
  • toolkit
  • training
  • training corpus
  • training data
  • training data set
  • training set
  • trigram
  • trigram language model
  • unigram
  • unigram language model
  • unigram probability
  • window size
  • word
  • word boundaries
  • word co-occurrence
  • word sequence
  • word sequences
  • word trigram
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
