ACL RD-TEC 1.0 Summarization of W96-0108

Paper Title:
A STATISTICAL APPROACH TO AUTOMATIC OCR ERROR CORRECTION IN CONTEXT

Authors: Xiang Tong and David A. Evans

Other assigned terms:

  • approach
  • automatic correction
  • back-off model
  • bigram
  • character sequence
  • characters
  • conditional probabilities
  • conditional probability
  • confusion probability
  • confusion probability table
  • context information
  • device
  • dictionary
  • dictionary entries
  • discourse
  • discourse structures
  • edit distance
  • error rate
  • error reduction rate
  • evaluations
  • events
  • fact
  • feature
  • generation
  • heuristics
  • index
  • input string
  • language model
  • language models
  • lexicon
  • lexicon entries
  • lexicon entry
  • meaning
  • method
  • n-gram
  • n-gram vector
  • n-grams
  • natural-language
  • part-of-speech
  • prior probability
  • probabilities
  • probability
  • process
  • processing tasks
  • query
  • query vector
  • sentence
  • sentences
  • source text
  • statistical approach
  • statistics
  • substring
  • system performance
  • tags
  • target string
  • technique
  • technology
  • term
  • term frequency
  • test corpus
  • test set
  • text
  • training
  • training corpus
  • training data
  • training set
  • training text
  • transposition
  • trigram
  • vector space
  • word
  • word boundaries
  • word boundary
  • word error rate
  • word meaning
  • word sequence
  • word trigram
  • words

This page last edited on 10 May 2017.
