ACL RD-TEC 1.0 Summarization of P04-1067
Paper Title:
A GEOMETRIC VIEW ON BILINGUAL LEXICON EXTRACTION FROM COMPARABLE CORPORA
Authors: Eric Gaussier and J.M. Renders and I. Matveeva and C. Goutte and H. Dejean
Primarily assigned technology terms:
- algorithm
- bilingual lexicon extraction
- bootstrap
- canonical correlation analysis
- clustering
- computing
- correlation analysis
- cross-lingual information retrieval
- crosslanguage information retrieval
- decomposition
- em algorithm
- encoding
- expectation-maximisation
- extraction system
- fisher kernel
- indexing
- information retrieval
- information retrieval system
- information retrieval systems
- kernel
- kernels
- latent semantic analysis
- lemmatisation
- lexicon extraction
- nearest neighbors
- pos-tagging
- preprocessing
- principal component analysis
- processing
- pruning
- query expansion
- re-ranking
- retrieval system
- retrieval systems
- search
- search method
- searching
- semantic analysis
- soft clustering
- spelling
- statistical technique
- tokenisation
- validation
Other assigned terms:
- approach
- association measure
- bilingual dictionary
- bilingual lexicon
- bilingual lexicons
- canonical correlation
- case
- cluster
- clusters
- co-occurrence
- co-occurrences
- community
- comparable corpora
- comparable corpus
- context vector
- context vectors
- context window
- context words
- corpora
- correlation
- cosine measure
- derivations
- dice
- dictionaries
- dictionary
- dictionary entries
- distribution
- empirical evaluation
- feature
- french
- generation
- generative model
- interpretation
- latent semantic
- lexicon
- likelihood
- linear combination
- linguistic
- mapping
- meaning
- meanings
- measure
- measures
- memory space
- method
- mutual information
- noise
- norm
- nouns
- pairs of words
- parallel corpora
- polysemous words
- polysemy
- precision
- probabilistic approach
- probabilistic model
- probabilities
- probability
- process
- processing time
- projection
- query
- rank order
- relation
- representations
- seed
- semantic
- semantic representations
- similarity between words
- similarity measure
- similarity measures
- source language
- synonyms
- synonymy
- target languages
- target word
- technique
- term
- term-document matrix
- terms
- test set
- tokens
- training
- training set
- translation pair
- translation pairs
- translations
- vector space
- vocabulary
- word
- words