#4764 The resulting Arabic word segmentation system achieves around 97% exact match accuracy on a test corpus containing 28,449 word tokens.
model,27-5-P03-1051,ak
corpus
</term>
, and re-estimate the
<term>
model parameters
</term>
with the expanded
<term>
vocabulary
#4735 To improve the segmentation accuracy, we use an unsupervised algorithm for automatically acquiring new stems from a 155 million word unsegmented corpus, and re-estimate the model parameters with the expanded vocabulary and training corpus.