ACL RD-TEC 1.0 Summarization of P06-1070
Paper Title:
EXPLOITING COMPARABLE CORPORA AND BILINGUAL DICTIONARIES FOR CROSS-LANGUAGE TEXT CATEGORIZATION
Authors: Alfio Gliozzo and Carlo Strapparava
Primarily assigned technology terms:
- algorithm
- categorization
- classification
- collapsing
- computational linguistics
- computer science
- computing
- decomposition
- dimensionality reduction
- disambiguation
- indexing
- information retrieval
- kernel
- knowledge acquisition
- language processing
- latent semantic analysis
- learning
- machine learning
- machine translation
- machines classification
- multilingual domain vsm
- natural language processing
- nlp
- processing
- question answering
- semantic analysis
- semantic web
- sense disambiguation
- similarity estimation
- singular value decomposition
- statistical learning
- support vector machines
- taggers
- text categorization
- tuning
- unsupervised technique
- vector space model
- word sense disambiguation
Other assigned terms:
- aligned parallel corpus
- alignment indication
- ambiguity
- approach
- bag of words
- benchmark
- bilingual dictionaries
- bilingual dictionary
- case
- categorization task
- cluster
- clusters
- co-occurrences
- community
- comparable corpora
- concept
- concepts
- corpora
- culture
- device
- dictionaries
- dictionary
- dimensionality
- document
- document frequency
- document vectors
- domain model
- english lexicon
- estimation
- evaluation task
- external knowledge
- fact
- hypothesis
- index
- inverse document frequency
- kernel function
- knowledge
- large corpus
- latent semantic
- lemmata
- lexical resource
- lexical resources
- lexicon
- linguistics
- maps
- meanings
- methodology
- multilingual corpus
- multilinguality
- named entities
- natural language
- news corpus
- nlp applications
- noise
- nouns
- parallel corpora
- parallel corpus
- parallel text
- possible translation
- princeton wordnet
- process
- semantic
- semantic classes
- semantic domain
- semantic relations
- similarity function
- source language
- support vector
- synset
- synsets
- target language
- technique
- term
- terms
- text
- text similarity
- topics
- training
- training examples
- translation pairs
- translations
- vector space
- vocabulary
- word
- word sense
- word senses
- wordnet
- wordnet synsets
- words