tech,17-2-P03-1051,bq Our method is seeded by a small <term> manually segmented Arabic corpus </term> and uses it to bootstrap an <term> unsupervised algorithm </term> to build the <term> Arabic word segmenter </term> from a large <term> unsegmented Arabic corpus </term> .
tech,9-5-P03-1051,bq To improve the <term> segmentation </term><term> accuracy </term> , we use an <term> unsupervised algorithm </term> for automatically acquiring new <term> stems </term> from a 155 million <term> word </term><term> unsegmented corpus </term> , and re-estimate the <term> model parameters </term> with the expanded <term> vocabulary </term> and <term> training corpus </term> .