ACL RD-TEC 1.0 Summarization of W06-2604
Paper Title:
A MULTICLASSIFIER BASED DOCUMENT CATEGORIZATION SYSTEM: PROFITING FROM THE SINGULAR VALUE DECOMPOSITION DIMENSIONALITY REDUCTION TECHNIQUE
Authors: Ana Zelaia and Iñaki Alegria and Olatz Arregi and Basilio Sierra
Primarily assigned technology terms:
- algorithm
- bagging
- binary classification
- binary text categorization
- boosting
- categorization
- classification
- classification algorithm
- classification approach
- classification method
- classification system
- classifier
- classifiers
- database
- databases
- decomposition
- dimension reduction
- dimensionality reduction
- dimensionality reduction technique
- document categorization
- document representation
- document representation and classification
- factoring
- indexing
- information retrieval
- information retrieval tasks
- k-nn
- k-nn classification
- latent semantic indexing
- learning
- learning algorithm
- machine learning
- measuring
- nearest neighbors
- neighbor classification
- parameter setting
- porter stemmer
- preprocessing
- ranking
- semantic indexing
- singular value decomposition
- stemmer
- svd dimensionality reduction
- text categorization
- text categorization system
- tuning
- vector representation
- vector space model
- voting
- voting scheme
Other assigned terms:
- approach
- basque
- benchmark
- case
- categorization task
- category label
- cosine similarity
- cosine similarity measure
- dimensionality
- distribution
- document
- document collection
- document collections
- document vector
- document vectors
- evaluation measures
- experimental results
- fact
- index
- information organization
- latent semantic
- measure
- measures
- method
- natural language
- natural language texts
- parameter settings
- polysemy
- precision
- process
- projection
- punctuation
- relation
- reuters-21578 standard document collection
- semantic
- semantic similarity
- semantic structure
- semantic structures
- similarity measure
- stems
- synonymy
- technique
- term
- term-document matrix
- testing set
- text
- text documents
- topics
- training
- training database
- training dataset
- training document
- training documents
- training examples
- training set
- vector space
- words