ACL RD-TEC 1.0 Summarization of P06-2004
Paper Title:
THE EFFECT OF CORPUS SIZE IN COMBINING SUPERVISED AND UNSUPERVISED TRAINING FOR DISAMBIGUATION
Authors: Michaela Atterer and Hinrich Schütze
Primarily assigned technology terms:
- algorithm
- ambiguity resolution
- approximation
- coding
- collins parser
- computational linguistics
- computing
- database
- dependency parser
- disambiguation
- disambiguation algorithm
- disambiguation method
- entity recognition
- index construction
- indexing
- language processing
- learning
- lexicalized parser
- matching
- minipar
- named entity recognition
- natural language processing
- nlp
- nlp systems
- parser
- parsers
- parsing
- partial parsing
- pp disambiguation
- processing
- querying
- recognition
- search
- search engine
- search engines
- statistical parser
- supervised training
- syntactic disambiguation
- tile
- unsupervised approach
- unsupervised learning
- unsupervised training
Other assigned terms:
- adjective
- ambiguity
- analogy
- annotated corpora
- annotated corpus
- approach
- association for computational linguistics
- attachment ambiguity
- backoff
- backoff model
- bias
- case
- corpora
- corpus size
- data set
- data sets
- data structure
- dependency structures
- development set
- disambiguation system
- document
- european monetary system
- events
- experimental results
- fact
- gold standard
- head noun
- hypernym
- implementation
- index
- inverted index
- knowledge
- large training
- lattice
- lattices
- linguistics
- measure
- method
- modifier
- mutual information
- named entities
- named entity
- names
- natural language
- noise
- nouns
- parse
- penn treebank
- person names
- phrase
- phrase attachment
- pointwise mutual information
- pp attachment
- preposition
- prepositional phrase
- prepositional phrase attachment
- probabilities
- procedure
- proper noun
- queries
- query
- rc attachment
- relative clause
- relative clause attachment
- relative clauses
- reuters corpus
- sentence
- sentences
- set size
- sparse data
- statistics
- subtree
- subtrees
- test set
- text
- topics
- training
- training set
- treebank
- unannotated corpora
- unannotated corpus
- unannotated text
- unlabeled corpora
- unlabeled corpus
- verb
- verb attachment
- word
- word count
- words
- wsj corpus