ACL RD-TEC 1.0 Summarization of P06-1069

Paper Title:
A COMPARISON AND SEMI-QUANTITATIVE ANALYSIS OF WORDS AND CHARACTER-BIGRAMS AS FEATURES IN CHINESE TEXT CATEGORIZATION

Authors: Jingyang Li, Maosong Sun, and Xian Zhang

Other assigned terms:

  • ambiguity
  • approach
  • association for computational linguistics
  • bigram
  • category label
  • chinese language
  • chinese text
  • chinese words
  • classification performance
  • classification task
  • complementation
  • dimensionality
  • document
  • document collection
  • document collections
  • encyclopedia
  • fact
  • feature
  • feature list
  • feature selection criterion
  • feature selection scheme
  • feature space
  • implementation
  • information quantity
  • information theory
  • latent semantic
  • linguistics
  • meaning
  • meanings
  • measure
  • method
  • noise
  • part-of-speech
  • performance comparison
  • phrase
  • precision
  • probability
  • probability density
  • processing tasks
  • qualitative analysis
  • scalability
  • semantic
  • semantic information
  • sparseness problem
  • statistics
  • support vector
  • svm implementation
  • tags
  • term
  • term weighting scheme
  • terms
  • test set
  • text
  • text categorization evaluation
  • theory
  • training
  • training data
  • training document
  • training phase
  • training set
  • training time
  • weighting scheme
  • word
  • word features
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
