ACL RD-TEC 1.0 Summarization of P06-1073

Paper Title:
MAXIMUM ENTROPY BASED RESTORATION OF ARABIC DIACRITICS

Authors: Imed Zitouni, Jeffrey S. Sorensen, and Ruhi Sarikaya

Other assigned terms:

  • acoustic signal
  • affixes
  • alphabet
  • ambiguity
  • approach
  • arabic text
  • arabic treebank
  • array
  • association for computational linguistics
  • binary features
  • case
  • character sequence
  • characters
  • class probability
  • classification problem
  • comparative study
  • conditional probability
  • context information
  • contextual information
  • diacritization error rate
  • distribution
  • document
  • english translation
  • entropy
  • error rate
  • experimental results
  • feature
  • feature sets
  • feature space
  • feature types
  • formal speech
  • grammar
  • grapheme
  • heuristic
  • hmm model
  • hypotheses
  • inflection
  • information sources
  • input text
  • interpretation
  • knowledge
  • labeling
  • language processing tasks
  • lattice
  • lexical features
  • lexicon
  • likelihood
  • linguistics
  • markov models
  • markov sequence
  • maxent model
  • meaning
  • method
  • modern standard arabic
  • morphological information
  • n-gram
  • n-gram model
  • n-gram models
  • n-grams
  • named entity
  • natural language
  • natural language processing tasks
  • normalization factor
  • opinion
  • parse
  • parsing model
  • part-of-speech
  • part-of-speech tag
  • pause
  • probabilities
  • probability
  • probability distribution
  • processing tasks
  • pronoun
  • runtime
  • search space
  • search task
  • segment-based information
  • segments
  • sentence
  • sentence meaning
  • signal
  • sources of information
  • standard arabic
  • statistical model
  • statistical models
  • stem
  • suffix
  • suffixes
  • symbols
  • syntactic information
  • system performance
  • tag sequence
  • tagging problem
  • tags
  • technique
  • term
  • terms
  • testing data
  • testing set
  • text
  • training
  • training and testing data
  • training corpus
  • training data
  • training examples
  • training phase
  • training set
  • treebank
  • treebank corpus
  • trigram
  • utterance
  • verb
  • vowel
  • word
  • word error rate
  • word level
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
