ACL RD-TEC 1.0 Summarization of H92-1041
Paper Title:
FEATURE SELECTION AND FEATURE EXTRACTION FOR TEXT CATEGORIZATION
Primarily assigned technology terms:
- binary categorization
- bracketing
- broadcast information service
- capitalization
- categorization
- classification
- classifier
- classifiers
- cluster analysis
- clustering
- clustering method
- database
- decision tree
- decision tree induction
- document retrieval
- document routing
- feature clustering
- feature extraction
- feature selection
- indexing
- induction
- language processing
- language processing systems
- natural language processing
- natural language processing systems
- neighbor clustering
- nlp
- noun phrase bracketing
- phrase bracketing
- phrase clustering
- phrase indexing
- predictor
- processing
- pruning
- query interpretation
- reasoning
- scoring
- statistical approaches
- statistical methods
- statistical techniques
- statistical text categorization
- statistical training
- syntactic analysis
- syntactic phrase indexing
- tagger
- term clustering
- text categorization
- text categorization method
- text categorization system
- text classification
- text representation
- text retrieval
- tree induction
- word-based indexing
- word-based representation
Other assigned terms:
- abbreviations
- ambiguity
- approach
- binary features
- case
- classification task
- classification tasks
- cluster
- clusters
- correlation
- data set
- data sets
- decision theory
- dimensionality
- document
- fact
- feature
- feature set
- feature sets
- function words
- index
- interpretation
- knowledge
- large training
- manual indexing
- meaning
- measure
- measures
- method
- muc-3
- muc-3 corpus
- muc-3 testset
- mutual information
- natural language
- natural language texts
- noun phrase
- noun phrases
- nouns
- parse
- phrase
- phrase formation
- polysemy
- precision
- prior probability
- probabilistic model
- probabilistic models
- probabilities
- probability
- probability estimates
- procedure
- punctuation
- query
- representations
- reuters corpus
- reuters data set
- semantic
- semantic relationships
- set size
- similarity metric
- standard decision tree
- statistical model
- synonymy
- syntactic class
- syntactic context
- syntactic parse
- syntactic phrase
- tags
- term
- terms
- test set
- text
- text classification task
- theory
- tokens
- training
- training corpus
- training data
- training documents
- training examples
- training set
- transcripts
- translations
- tree
- word
- words