ACL RD-TEC 1.0 Summarization of C04-1117
Paper Title:
COGNATE MAPPING - A HEURISTIC STRATEGY FOR THE SEMI-SUPERVISED ACQUISITION OF A SPANISH LEXICON FROM A PORTUGUESE SEED LEXICON
Authors: Stefan Schulz, Kornel Markó, Eduardo Sbrissia, Percy Nohama, and Udo Hahn
Primarily assigned technology terms:
- algorithm
- automated acquisition
- automatic generation
- categorization
- clustering
- cognate identification
- content representation
- cross-language information retrieval
- cross-language text retrieval
- databases
- decomposition
- document normalization
- document retrieval
- heuristic strategy
- identification
- indexing
- information extraction
- information retrieval
- language acquisition
- language engineering
- lexical acquisition
- lexicalized content description
- lexicon acquisition
- machine translation
- matching
- mining
- morpheme decomposition
- morphological decomposition
- morphology
- n-gram matching
- normalization
- retrieving
- search
- second language acquisition
- segmentation
- semantic validation
- string match
- string transformation
- text mining
- text retrieval
- transcription
- validation
- vector comparison
- word translation
Other assigned terms:
- acronym
- affixes
- ambiguity
- approach
- case
- characters
- coefficient
- comparable corpora
- complex word
- context similarity
- context vector
- context vectors
- corpora
- corpus size
- dice
- dice coefficient
- dictionaries
- dictionary
- distribution
- document
- document collections
- error rate
- euclidean distance
- evaluation metrics
- french
- frequency list
- generation
- generation strategy
- hansard corpus
- heuristic
- hypotheses
- interlingua
- interlingual representation
- knowledge
- language pair
- language pairs
- large corpora
- lexeme
- lexical translation
- lexicon
- lexicon entries
- lexicon entry
- linguistic
- local context
- mapping
- mapping rules
- mappings
- meanings
- measure
- measures
- medical corpora
- mesh
- method
- monolingual corpora
- morpheme
- multilingual lexicon
- n-gram
- non-parallel corpora
- nouns
- parallel corpora
- precision
- procedure
- queries
- random sample
- relation
- research and development
- seed
- semantic
- semantic relations
- similarity judgment
- similarity measures
- similarity metrics
- spanish lexicon
- spanish subword lexicon
- stem
- stems
- string similarity
- sublanguage
- suffix
- suffixes
- synonym
- synonyms
- synonymy
- technique
- terms
- text
- text corpora
- text corpus
- thesaurus
- tokens
- transformation
- transformation rules
- translation rules
- translations
- usability
- user
- wildcard
- word
- word formation
- word frequency
- word stem
- wordnet
- words