ACL RD-TEC 1.0 Summarization of J96-1001
Paper Title:
TRANSLATING COLLOCATIONS FOR BILINGUAL LEXICONS: A STATISTICAL APPROACH
Authors: Frank Smadja and Vasileios Hatzivassiloglou and Kathleen R. McKeown
Primarily assigned technology terms:
- algorithm
- analyzer
- bilingual lexicography
- categorization
- compiler
- computational lexicography
- computational linguistics
- computer science
- computing
- database
- databases
- direct translation
- disambiguation
- extraction systems
- full machine translation
- identification
- indexing
- information extraction
- information extraction systems
- information retrieval
- information retrieval system
- information retrieval systems
- jaccard coefficient
- language acquisition
- language processing
- machine translation
- machine translation systems
- matching
- message understanding
- morphological analyzer
- mt system
- multilingual generation
- multilingual information retrieval
- multilingual summarization
- natural language processing
- parser
- part-of-speech tagger
- postprocessing
- preprocessing
- processing
- retrieval system
- retrieval systems
- sampling
- scoring
- scoring method
- search
- searching
- second language acquisition
- sense disambiguation
- sentence alignment
- sentence alignment program
- statistical analysis
- statistical approaches
- statistical methods
- statistical natural language processing
- statistical techniques
- sublanguage translation
- summarization
- summarization process
- summarization system
- tagger
- tagging
- terminology
- text categorization
- tokenizer
- translation algorithm
- translation systems
- translator
- word ordering
- word translation
Other assigned terms:
- aligned corpus
- alignment error rate
- ambiguous words
- american english
- approach
- argumentation
- association for computational linguistics
- bilingual corpora
- bilingual corpus
- bilingual dictionary
- bilingual lexicon
- bilingual lexicons
- break
- british english
- candidate translation
- case
- co-occurrence
- coefficient
- collocation
- community
- complex sentence
- compound noun
- compounds
- concepts
- conditional probabilities
- corpora
- correlation
- derivation
- dice
- dice coefficient
- dictionary
- disk
- distribution
- ellipsis
- english corpus
- english translation
- entropy
- error rate
- estimation
- evaluation methodology
- evaluations
- events
- experimental results
- fact
- feature
- french
- french corpus
- french translation
- french word
- generation
- generative capacity
- heuristic
- human translation
- implementation
- index
- inflected forms
- interlingua
- joint probability
- knowledge
- large corpus
- lexical information
- lexicography
- lexicon
- linear time
- linguist
- linguistics
- literal translation
- mapping
- mathematical model
- meaning
- meanings
- measure
- measures
- message
- method
- methodology
- multilingual information
- multiword expressions
- mutual information
- natural language
- natural languages
- noise
- noun phrases
- pairs of words
- paragraphs
- parallel bilingual corpus
- parallel corpora
- part-of-speech
- part-of-speech information
- phrase
- polysemy
- portability
- precision
- preposition
- probabilities
- probability
- process
- query
- relation
- representations
- search query
- search space
- semantic
- semantic features
- semantic rules
- sentence
- sentences
- similarity measure
- similarity measures
- similarity score
- similarity scores
- size of the corpus
- source language
- sparse data
- statistical natural language
- statistics
- sublanguage
- syntactic constituents
- syntactic relation
- target language
- target languages
- technical terminology
- technique
- technology
- term
- terms
- text
- text corpora
- training
- training corpora
- training data
- transcripts
- translations
- understanding
- uniform distribution
- user
- verb
- word
- word collocation
- word level
- word order
- word pair
- word-by-word basis
- words