Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
#4554Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
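The 87.5% agreement figure reported above is the fraction of tokens on which the two stemmers produce the same stem. A minimal sketch of such a comparison (the stem lists here are hypothetical, not the paper's data):

```python
def stemmer_agreement(stems_a, stems_b):
    """Fraction of positions where two stemmers output the same stem."""
    assert len(stems_a) == len(stems_b), "stem lists must be aligned"
    matches = sum(a == b for a, b in zip(stems_a, stems_b))
    return matches / len(stems_a)

# Hypothetical transliterated stems from two stemmers over four tokens:
rate = stemmer_agreement(
    ["kitab", "qalam", "dars", "bayt"],
    ["kitab", "qalam", "dars", "buyut"],
)
# rate is 0.75: the stemmers agree on 3 of 4 tokens
```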
other,16-7-P03-1050,ak
<term>
Task-based evaluation
</term>
using
<term>
Arabic information retrieval
</term>
indicates an improvement of 22-38 % in
<term>
average precision
</term>
over
<term>
unstemmed text
</term>
, and 96 % of the performance of the proprietary
<term>
stemmer
</term>
above .
#4587Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
lr,22-6-P03-1050,ak
Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
#4556Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
other,7-5-P03-1050,ak
Examples and results will be given for
<term>
Arabic
</term>
, but the approach is applicable to any
<term>
language
</term>
that needs
<term>
affix removal
</term>
.
#4519Examples and results will be given for Arabic, but the approach is applicable to any language that needs affix removal.
lr,26-6-P03-1050,ak
Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
#4560Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
other,22-4-P03-1050,ak
<term>
Monolingual , unannotated text
</term>
can be used to further improve the
<term>
stemmer
</term>
by allowing it to adapt to a desired
<term>
domain
</term>
or
<term>
genre
</term>
.
#4510Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre.
other,7-3-P03-1050,ak
No
<term>
parallel text
</term>
is needed after the
<term>
training phase
</term>
.
#4485No parallel text is needed after the training phase.
lr,27-2-P03-1050,ak
The
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
</term>
and a
<term>
small ( 10K sentences ) parallel corpus
</term>
as its sole
<term>
training resources
</term>
.
#4475The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
tech,19-5-P03-1050,ak
Examples and results will be given for
<term>
Arabic
</term>
, but the approach is applicable to any
<term>
language
</term>
that needs
<term>
affix removal
</term>
.
#4531Examples and results will be given for Arabic, but the approach is applicable to any language that needs affix removal.
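The affix-removal task named in the sentence above can be illustrated with a toy longest-match affix stripper. This is a generic sketch, not the paper's method; the transliterated prefix and suffix lists are hypothetical examples:

```python
# Hypothetical affix lists (transliterated), for illustration only.
PREFIXES = ["al", "wa", "bi"]
SUFFIXES = ["at", "ha", "un"]

def strip_affixes(word, min_stem_len=2):
    """Greedily remove at most one matching prefix and one matching suffix,
    refusing any removal that would leave the stem too short."""
    for p in sorted(PREFIXES, key=len, reverse=True):
        if word.startswith(p) and len(word) - len(p) >= min_stem_len:
            word = word[len(p):]
            break
    for s in sorted(SUFFIXES, key=len, reverse=True):
        if word.endswith(s) and len(word) - len(s) >= min_stem_len:
            word = word[:-len(s)]
            break
    return word

# "alkitabun" loses the prefix "al" and the suffix "un", leaving "kitab"
```

A rule list like this is exactly the hand-built resource the paper's approach avoids needing.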
tech,34-6-P03-1050,ak
Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
#4568Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
tech,10-1-P03-1050,ak
This paper presents an
<term>
unsupervised learning approach
</term>
to building a
<term>
non-English ( Arabic ) stemmer
</term>
.
#4442This paper presents an unsupervised learning approach to building a non-English (Arabic) stemmer.
tech,28-7-P03-1050,ak
<term>
Task-based evaluation
</term>
using
<term>
Arabic information retrieval
</term>
indicates an improvement of 22-38 % in
<term>
average precision
</term>
over
<term>
unstemmed text
</term>
, and 96 % of the performance of the proprietary
<term>
stemmer
</term>
above .
#4599Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
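Average precision, the retrieval metric cited in the record above, can be computed from a ranked result list as follows. This is the standard textbook definition, not the paper's evaluation code; the document identifiers are made up:

```python
def average_precision(ranked, relevant):
    """Mean of the precision values at each rank where a relevant
    document is retrieved, divided by the total number of relevant docs."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked, start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

# Relevant docs d1 and d3 retrieved at ranks 1 and 3:
# precision there is 1/1 and 2/3, so AP = (1 + 2/3) / 2
ap = average_precision(["d1", "d2", "d3"], {"d1", "d3"})
```

Averaging this value over all queries gives the mean average precision typically reported in IR evaluations like the one above.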
tech,13-2-P03-1050,ak
The
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
</term>
and a
<term>
small ( 10K sentences ) parallel corpus
</term>
as its sole
<term>
training resources
</term>
.
#4461The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
tech,6-2-P03-1050,ak
The
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
</term>
and a
<term>
small ( 10K sentences ) parallel corpus
</term>
as its sole
<term>
training resources
</term>
.
#4454The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
lr,17-2-P03-1050,ak
The
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
</term>
and a
<term>
small ( 10K sentences ) parallel corpus
</term>
as its sole
<term>
training resources
</term>
.
#4465The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.
tech,16-6-P03-1050,ak
Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
#4550Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component.
tech,11-4-P03-1050,ak
<term>
Monolingual , unannotated text
</term>
can be used to further improve the
<term>
stemmer
</term>
by allowing it to adapt to a desired
<term>
domain
</term>
or
<term>
genre
</term>
.
#4499Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre.
lr,0-4-P03-1050,ak
No
<term>
parallel text
</term>
is needed after the
<term>
training phase
</term>
.
<term>
Monolingual , unannotated text
</term>
can be used to further improve the
<term>
stemmer
</term>
by allowing it to adapt to a desired
<term>
domain
</term>
or
<term>
genre
</term>
.
#4488No parallel text is needed after the training phase. Monolingual, unannotated text can be used to further improve the stemmer by allowing it to adapt to a desired domain or genre.
tech,3-7-P03-1050,ak
<term>
Task-based evaluation
</term>
using
<term>
Arabic information retrieval
</term>
indicates an improvement of 22-38 % in
<term>
average precision
</term>
over
<term>
unstemmed text
</term>
, and 96 % of the performance of the proprietary
<term>
stemmer
</term>
above .
#4574Task-based evaluation using Arabic information retrieval indicates an improvement of 22-38% in average precision over unstemmed text, and 96% of the performance of the proprietary stemmer above.
model,1-2-P03-1050,ak
The
<term>
stemming model
</term>
is based on
<term>
statistical machine translation
</term>
and it uses an
<term>
English stemmer
</term>
and a
<term>
small ( 10K sentences ) parallel corpus
</term>
as its sole
<term>
training resources
</term>
.
#4449The stemming model is based on statistical machine translation and it uses an English stemmer and a small (10K sentences) parallel corpus as its sole training resources.