ACL RD-TEC 1.0 Summarization of P06-1103
Paper Title:
WEAKLY SUPERVISED NAMED ENTITY TRANSLITERATION AND DISCOVERY FROM MULTILINGUAL COMPARABLE CORPORA
Authors: Alexandre Klementiev and Dan Roth
Primarily assigned technology terms:
- algorithm
- bootstrapping
- classification
- computational linguistics
- computing
- coupling
- crawling
- discriminative approach
- discriminative learning
- entity recognition
- identification
- information extraction
- iterative algorithm
- iterative training
- language morphology
- language processing
- learning
- learning algorithm
- learning framework
- learning techniques
- machine learning
- machine learning techniques
- matching
- morphology
- named entity recognition
- natural language processing
- ne discovery
- ne extraction
- ne transliteration
- nlp
- perceptron
- pre-processing
- processing
- question answering
- recognition
- scoring
- scoring function
- sequence alignment
- sequence matching
- sequence scoring
- similarity scoring
- supervised training
- thresholding
- time sequence scoring
- transliteration
- unsupervised learning
- unsupervised learning algorithm
- word alignment
Other assigned terms:
- aligned corpus
- annotation
- annotation effort
- approach
- approach to transliteration
- association for computational linguistics
- bilingual corpora
- bilingual corpus
- case
- coefficient
- comparable corpora
- comparable corpus
- corpora
- correlation
- dictionary
- dictionary definition
- distribution
- empty string
- euclidean distance
- fact
- feature
- feature space
- feature vector
- generation
- generative model
- generative models
- hand-tagged corpus
- histogram
- knowledge
- language knowledge
- language processing tasks
- likelihood
- linear model
- linguistics
- measure
- measures
- method
- named entities
- named entity
- natural language
- natural language processing tasks
- news corpus
- news web site
- nlp tasks
- phonetic sequence
- positive and negative examples
- probability
- processing tasks
- running time
- russian
- russian transliteration
- scoring metric
- set size
- signal
- similarity function
- similarity score
- source text
- sources of information
- stem
- substring
- target language
- temporal signature
- text
- tokens
- training
- training data
- training example
- training examples
- training set
- translations
- transliteration candidate
- transliteration model
- uniform probability
- untagged corpora
- web page
- web site
- window size
- word
- words