ACL RD-TEC 1.0 Summarization of W05-0802
Paper Title:
CROSS LANGUAGE TEXT CATEGORIZATION BY ACQUIRING MULTILINGUAL DOMAIN MODELS FROM COMPARABLE CORPORA
Authors: Alfio Gliozzo and Carlo Strapparava
Primarily assigned technology terms:
- algorithm
- categorization
- classification
- computational linguistics
- computer science
- computing
- decomposition
- disambiguation
- feature mapping
- idf term weighting
- information retrieval
- kernel
- latent semantic analysis
- learning
- learning algorithm
- lexical acquisition
- machine translation
- machines classification
- mapping function
- multilingual domain vsm
- nlp
- semantic analysis
- sense disambiguation
- similarity estimation
- singular value decomposition
- supervised learning
- support vector machines
- term categorization
- term translation
- term weighting
- text categorization
- unsupervised technique
- vector space model
- weighting
- word sense disambiguation
Other assigned terms:
- adjective
- adverb
- aligned parallel corpus
- alignment indication
- ambiguity
- analogy
- approach
- association for computational linguistics
- bag of words
- bilingual dictionaries
- bilingual dictionary
- case
- categorization problem
- categorization task
- cluster
- clusters
- comparable corpora
- concept
- concepts
- corpora
- culture
- data set
- data sets
- device
- dictionaries
- dictionary
- dimensionality
- document
- document frequency
- document vectors
- domain model
- estimation
- external knowledge
- external knowledge source
- feature
- feature space
- hypothesis
- implementation
- inverse document frequency
- kernel function
- knowledge
- language information
- large corpus
- latent semantic
- latent semantic space
- lemma
- lemmata
- lexical ambiguity
- lexical resource
- lexical resources
- linguistics
- mapping
- maps
- meanings
- multilingual corpus
- named entities
- news corpus
- nlp applications
- nouns
- parallel corpora
- parallel corpus
- parallel texts
- parameter settings
- part of speech
- parts of speech
- positive and negative examples
- possible translation
- priori
- process
- relation
- schema
- semantic
- semantic domain
- semantic space
- sentences
- similarity function
- similarity metrics
- similarity score
- source language
- support vector
- svm implementation
- svms
- target language
- technique
- term
- terms
- test set
- text
- text categorization problem
- text similarity
- tokens
- topics
- training
- translation pair
- translation pairs
- translations
- vector space
- verb
- vocabulary
- word
- word sense
- word-net
- words