ACL RD-TEC 1.0 Summarization of P06-1068
Paper Title:
A STUDY ON AUTOMATICALLY EXTRACTED KEYWORDS IN TEXT CATEGORIZATION
Authors: Anette Hulth and Beáta B. Megyesi
Primarily assigned technology terms:
- algorithm
- automatic keyword extraction
- automatic keyword indexing
- automatic summarization
- automatic text categorization
- automatic text summarization
- categorization
- classification
- classifier
- computational linguistics
- cross validation
- data representation
- dimensionality reduction
- extraction algorithm
- extraction procedure
- extraction system
- extrinsic evaluation
- extrinsic evaluation method
- feature selection
- indexer
- indexing
- information retrieval
- keyword extraction
- keyword indexing
- learning
- learning algorithm
- learning method
- linguistic analysis
- machine learning
- matching
- one learning
- parameter tuning
- paraphrasing
- pre-processing
- regression
- sentence extraction
- splitting
- stemmer
- summarization
- supervised machine learning
- support vector machines
- term selection
- terminology
- terminology extraction
- text categorization
- text classification
- text representation
- text summarization
- tuning
- validation
Other assigned terms:
- approach
- association for computational linguistics
- candidate term
- candidate terms
- categorization task
- chunks
- compounds
- data set
- dimensionality
- document
- document frequency
- evaluation measures
- evaluation method
- evaluations
- extraction process
- f-measure
- feature
- feature value
- feature vectors
- gold standard
- heuristics
- implementation
- information gain
- information source
- inverse document frequency
- keyword
- linguistic
- linguistics
- mapping
- meaning
- measure
- measures
- method
- n-grams
- nominals
- noun phrase
- noun phrases
- parameter settings
- parts-of-speech
- phrase
- pos tag
- precision
- prediction model
- probabilities
- procedure
- process
- representations
- reuters corpus
- sentence
- sentences
- stem
- stems
- support vector
- synonyms
- tags
- term
- term frequency
- terms
- test data
- test data set
- test set
- text
- tf * idf
- tokens
- training
- training data
- training data set
- training phase
- training set
- unigram
- vector space
- web pages
- word
- words