ACL RD-TEC 1.0 Summarization of W02-2006
Paper Title:
BOOTSTRAPPING A MULTILINGUAL PART-OF-SPEECH TAGGER IN ONE PERSON-DAY
Authors: Silviu Cucerzan and David Yarowsky
Primarily assigned technology terms:
- algorithm
- bootstrapping
- data acquisition
- data collection
- dictionary extraction
- induction
- induction algorithm
- iterative alignment
- learning
- levenshtein
- machine translation
- matching
- model estimation
- model estimation and re-estimation
- model induction
- modeling
- morphology
- parser
- parsing
- part-of-speech tagger
- part-of-speech tagging
- phrasal translation
- phrase translation
- processing
- re-estimation
- smoothing
- string match
- supervised bootstrapping
- supervised learning
- tagger
- taggers
- tagging
- unsupervised induction
Other assigned terms:
- adjective
- adverb
- affix
- affixes
- alignment models
- ambiguity
- annotation
- approach
- auxiliary verbs
- bilingual dictionaries
- bilingual dictionary
- case
- characters
- cluster
- corpora
- dependency model
- determiner
- determiners
- dictionaries
- dictionary
- dictionary entries
- distribution
- english translation
- english translations
- estimation
- evaluation data
- feature
- feature agreement
- foreign language
- foreign word
- foreign words
- gender agreement
- generation
- generative model
- grammar
- grammars
- hypothesis
- independence assumption
- inflected forms
- inflection
- knowledge
- language dictionary
- linguistic
- linguistic resources
- meaning
- measure
- measures
- method
- methodology
- monolingual corpora
- monolingual corpus
- noise
- nouns
- part of speech
- part-of-speech
- part-of-speech tag
- part-of-speech tags
- particle
- parts of speech
- parts-of-speech
- phrase
- pos distribution
- pos sequence
- pos tag
- possible translation
- preposition
- probabilities
- probability
- probability distribution
- probability distributions
- pronoun
- pronouns
- proper name
- proper noun
- punctuation
- seed
- sequence probability
- signal
- speech tag
- statistics
- stem
- suffix
- tag sequence
- tag sequence probability
- tags
- tagset
- term
- terms
- test data
- text
- text corpus
- textbook
- theory
- tokens
- training
- training corpora
- training data
- translation candidate
- translations
- unannotated corpora
- verb
- word
- words
- wsj corpora