#2706In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing from generation of true text through its transformation into the noisy output of an OCR system.
tech,31-3-N03-1018,ak
provide evaluation results involving
<term>
automatic extraction
</term>
of
<term>
translation lexicons
</term>
#2776We present an implementation of the model based on finite-state models, demonstrate the model's ability to significantly reduce character and word error rate, and provide evaluation results involving automatic extraction of translation lexicons from printed text.