ACL RD-TEC 1.0 Summarization of W04-0412
Paper Title:
NON-CONTIGUOUS WORD SEQUENCES FOR INFORMATION RETRIEVAL
Authors: Antoine Doucet and Helena Ahonen-Myka
Primarily assigned technology terms:
- algorithm
- approximation
- clustering
- computing
- document management
- document modeling
- document retrieval
- extraction technique
- feature selection
- greedy algorithm
- indexing
- information retrieval
- k-means
- linear interpolation
- matching
- mining
- modeling
- modelling
- normalization
- preprocessing
- processing
- retrieval system
- scoring
- searching
- splitting
- terminology
- text mining
- vector space model
- weighting
- word ordering
Other assigned terms:
- adjective
- approach
- bag of words
- case
- characters
- clusters
- co-occurrences
- coefficient
- concept
- cosine similarity
- cosine similarity measure
- document
- document collection
- document collections
- document frequency
- document vectors
- euclidean distance
- evaluation measure
- extraction process
- fact
- feature
- feature sets
- grammar
- index
- interpolation
- inverted document frequency
- keyphrase
- knowledge
- logical structure
- mapping
- meaning
- measure
- measures
- method
- modifier
- multilingual document
- multiword expressions
- natural language
- norm
- paragraphs
- parts-of-speech
- patent
- phrase
- precision
- prepositions
- priori
- procedure
- process
- queries
- query
- representations
- retrieval performance
- sentence
- similarity measure
- stem
- stems
- tag sequence
- technique
- term
- term frequency
- terms
- text
- textual information
- user
- vector space
- verb
- weighting scheme
- word
- word features
- word pair
- word sequences
- words
- xml document