ACL RD-TEC 1.0 Summarization of P98-2152

Paper Title:
JAPANESE OCR ERROR CORRECTION USING CHARACTER SHAPE SIMILARITY AND STATISTICAL LANGUAGE MODEL

Other assigned terms:

  • approximate word match
  • beam
  • bigram
  • bigram model
  • boundary marker
  • case
  • character bigram model
  • character sequence
  • characters
  • cluster
  • confusion matrix
  • confusion probability
  • corpora
  • dictionaries
  • dictionary
  • distance metric
  • distribution
  • document
  • edit distance
  • edr corpus
  • electric engineering
  • english speech
  • events
  • fact
  • feature
  • feature vector
  • feature vectors
  • foreign words
  • geometric distribution
  • handwriting
  • heuristic
  • hypotheses
  • hypothesis
  • index
  • input string
  • inverted index
  • japanese corpus
  • japanese sentences
  • joint probability
  • language model
  • language models
  • likelihood
  • linguistic
  • measures
  • method
  • ngram
  • noisy channel
  • part of speech
  • perplexity
  • poisson distribution
  • priori
  • probabilities
  • probability
  • procedure
  • process
  • pronunciation
  • rank order
  • recognition accuracy
  • recognition errors
  • sentence
  • sentences
  • statistical language model
  • substring
  • symbol
  • symbols
  • technique
  • technology
  • test data
  • test set
  • text
  • training
  • training corpus
  • training data
  • training set
  • unigram
  • unigram probability
  • vocabulary
  • word
  • word bigram model
  • word boundaries
  • word boundary
  • word model
  • word perplexity
  • word sequence
  • words

This page last edited on 10 May 2017.
