other,11-1-P03-1051,bq | </term> by a <term> model </term> that a <term> | word | </term> consists of a sequence of <term> morphemes | #4611 We approximate Arabic's rich morphology by a model that a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme). |
other,20-5-P03-1051,bq | <term> stems </term> from a 155 million <term> | word | </term> <term> unsegmented corpus </term> , | #4726 To improve the segmentation accuracy, we use an unsupervised algorithm for automatically acquiring new stems from a 155 million word unsegmented corpus, and re-estimate the model parameters with the expanded vocabulary and training corpus. |