#4550Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
tech,19-5-P03-1050,ak
any
<term>
language
</term>
that needs
<term>
affix removal
</term>
. Our
<term>
resource-frugal approach
#4531Examples and results will be given for Arabic, but the approach is applicable to any language that needs affix removal.
tech,28-7-P03-1050,ak
the performance of the proprietary
<term>
stemmer
</term>
above . We approximate
<term>
Arabic
#4599Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
tech,3-7-P03-1050,ak
<term>
Task-based evaluation
</term>
using
<term>
Arabic information retrieval
</term>
indicates an improvement of 22-38
#4574Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
tech,34-6-P03-1050,ak
annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
<term>
Task-based evaluation
</term>
#4568Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
tech,4-1-P03-1050,ak
users
</term>
. This paper presents an
<term>
unsupervised learning approach
</term>
to building a
<term>
non-English (
#4436This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer.
tech,6-2-P03-1050,ak
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
#4454The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.