#2776We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text.
other,38-1-N03-1018,ak
through its transformation into the
<term>
noisy output
</term>
of an
<term>
OCR system
</term>
. The
#2706In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system.