ACL RD-TEC 1.0 Summarization of W06-3208
Paper Title:
MORPHOLOGY INDUCTION FROM LIMITED NOISY DATA USING APPROXIMATE STRING MATCHING
Authors: Burcu Karagol-Ayan, David Doermann, and Amy Weinberg
Primarily assigned technology terms:
- algorithm
- approximate string matching
- bootstrapping
- co-training
- dynamic programming
- edit distance algorithm
- editing
- illustration
- induction
- induction framework
- language morphology
- language morphology induction
- latent semantic analysis
- learner
- learning
- matching
- maximum likelihood
- morphological analysis
- morphological analyzers
- morphologizer
- morphology
- morphology induction
- morphology learner
- morphology learning
- nlp
- ranking
- search
- search engines
- searching
- segmentation
- segmentation process
- segmenter
- semantic analysis
- splitting
- statistical methods
- string matching
- string searching
- thresholding
- web search
- word guessing
- word morphologizer
- word segmentation
- word segmenter
Other assigned terms:
- affix
- affixes
- ambiguity
- annotated treebank
- approach
- bilingual dictionaries
- case
- characters
- data set
- data sets
- dictionaries
- dictionary
- dictionary entries
- distance matrix
- edit distance
- error rate
- exact match
- fact
- generative probability
- generative probability model
- heuristic
- heuristics
- inflected forms
- inflection
- knowledge
- latent semantic
- lattice
- lattice structure
- lexicon
- likelihood
- linguists
- minimum description length
- morpheme
- morphemes
- mutual information
- natural language
- natural language morphology
- noise
- probability
- probability model
- process
- punctuation
- russian
- segments
- semantic
- semantic similarity
- sentence
- sentences
- stem
- stems
- string edit distance
- substring
- suffix
- suffixes
- terms
- text
- training
- training data
- treebank
- treebank corpus
- vowel
- word
- word morphology
- word pair
- words