tech,22-2-P03-1051,ak Our method is seeded by a <term> small manually segmented Arabic corpus </term> and uses it to bootstrap an <term> unsupervised algorithm </term> to build the <term> Arabic word segmenter </term> from a <term> large unsegmented Arabic corpus </term> .
measure(ment),10-6-P03-1051,ak The resulting <term> Arabic word segmentation system </term> achieves around 97 % <term> exact match accuracy </term> on a <term> test corpus </term> containing 28,449 <term> word tokens </term> .
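As a rough illustration of the metric named above, the sketch below assumes that "exact match accuracy" means the fraction of word tokens whose predicted segmentation is identical to the reference segmentation. The function name, the tuple-of-morphemes representation, and the toy transliterated tokens are hypothetical and are not taken from the source.

```python
def exact_match_accuracy(gold, predicted):
    """Fraction of word tokens whose predicted segmentation equals the gold one.

    Assumption: each token's segmentation is a tuple of morpheme strings, and a
    token counts as correct only if every morpheme boundary matches.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        return 0.0
    matches = sum(1 for g, p in zip(gold, predicted) if g == p)
    return matches / len(gold)


# Hypothetical toy data (placeholder transliterations, not from the test corpus).
gold = [("w", "ktb"), ("ktb", "hm"), ("ktb",), ("Al", "ktAb")]
pred = [("w", "ktb"), ("ktbhm",), ("ktb",), ("Al", "ktAb")]

accuracy = exact_match_accuracy(gold, pred)  # 3 of 4 tokens match exactly
```

Under this reading, around 97 % on 28,449 word tokens corresponds to roughly 27,600 tokens segmented exactly as in the reference.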