ACL RD-TEC 1.0 Summarization of W06-0134

Paper Title:
A PRAGMATIC CHINESE WORD SEGMENTATION SYSTEM

Authors: Wei Jiang, Yi Guan, and Xiao-Long Wang

Other assigned terms:

  • ambiguous segmentation
  • ambiguous words
  • annotated corpora
  • association for computational linguistics
  • case
  • character sequence
  • characters
  • chinese language
  • chinese word
  • conditional probability
  • context feature
  • context features
  • corpora
  • data structure
  • dictionaries
  • dictionary
  • entity recognition module
  • entity recognition task
  • entropy
  • estimation
  • f score
  • feature
  • fmeasure
  • generative rule
  • information gain
  • lattice
  • lexicon
  • linguist
  • linguistic
  • linguistic features
  • linguistics
  • measure
  • method
  • mutual information
  • n-grams
  • named entities
  • named entity
  • natural language
  • nlp tasks
  • open test
  • out-of-vocabulary word
  • precision
  • probability
  • recognition module
  • recognition task
  • regular expressions
  • segmentation bakeoff
  • semantic
  • sentence
  • sentences
  • sparse data
  • sparse data problem
  • statistics
  • suffix
  • symbol
  • syntactic unit
  • system description
  • system performance
  • tags
  • target word
  • terms
  • test corpus
  • training
  • training corpora
  • training corpus
  • training data
  • trigram
  • trigram model
  • word
  • word boundaries
  • word lattice
  • word sequence
  • word-based model
  • words
  • xml format

Extracted Section Types:


This page last edited on 10 May 2017.