ACL RD-TEC 1.0 Summarization of W04-3244
Paper Title:
LEARNING NONSTRUCTURAL DISTANCE METRIC BY MINIMUM CLUSTER DISTORTION
Authors: Daichi Mochihashi, Genichiro Kikui, and Kenji Kita
Primarily assigned technology terms:
- 5-fold cross validation
- classification
- classifiers
- cluster recovery
- clustering
- cross validation
- cross-validation
- dialogue translation
- dialogue translation technology
- dimensionality reduction
- distance function
- document clustering
- document retrieval
- feature weighting
- fisher kernel
- global optimization
- hierarchical clustering
- indexing
- induction
- information retrieval
- k-means
- k-means clustering
- kernel
- language processing
- latent semantic indexing
- learning
- machine learning
- matching
- natural language processing
- optimization
- paraphrasing
- pattern recognition
- preprocessing
- processing
- question answering
- random sampling
- random selection
- recognition
- sampling
- semantic indexing
- sentence retrieval
- spectral clustering
- speech dialogue translation
- support vector machines
- text classification
- translation technology
- validation
- vector-based language processing
- weighting
Other assigned terms:
- approach
- bag of words
- case
- classification task
- cluster
- cluster centroid
- clusters
- concept
- correlation
- correlations
- cosine distance
- cosine similarity
- dimensionality
- distance metric
- distribution
- document
- document length
- euclidean distance
- feature
- feature space
- feature vector
- feature vectors
- feature weights
- heuristic
- japanese sentences
- kernel function
- large corpus
- latent semantic
- lexicon
- linear algebra
- linguistic
- linguistic data
- linguistic expressions
- linguistic features
- measure
- method
- natural language
- noise
- norm
- paragraphs
- part-of-speech
- part-of-speech tags
- precision
- probability
- query
- retrieval task
- seed
- semantic
- sentence
- sentences
- statistics
- support vector
- svms
- tags
- technology
- term
- test data
- text
- text classification task
- theorem
- training
- training data
- training documents
- translations
- transposition
- vector space
- words