#30772We present a statistical model of Japanese unknown words consisting of a set of length and spelling models classified by the character types that constitute a word.
other,34-2-P99-1036,ak
Chinese
</term>
(
<term>
kanji
</term>
) and
<term>
phonograms
</term>
like
<term>
English
</term>
(
<term>
katakana
#30808The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
measure(ment),5-3-P99-1036,ak
word segmentation accuracy
</term>
and
<term>
part of speech tagging accuracy
</term>
are improved by the proposed
<term>
#30820Both word segmentation accuracy and part of speech tagging accuracy are improved by the proposed model.
measure(ment),1-3-P99-1036,ak
</term>
(
<term>
katakana
</term>
) . Both
<term>
word segmentation accuracy
</term>
and
<term>
part of speech tagging accuracy
#30816Both word segmentation accuracy and part of speech tagging accuracy are improved by the proposed model.
model,15-3-P99-1036,ak
</term>
are improved by the proposed
<term>
model
</term>
. The
<term>
model
</term>
can achieve
#30830Both word segmentation accuracy and part of speech tagging accuracy are improved by the proposed model.
model,1-4-P99-1036,ak
the proposed
<term>
model
</term>
. The
<term>
model
</term>
can achieve 96.6 %
<term>
tagging accuracy
#30833The model can achieve 96.6% tagging accuracy if unknown words are correctly segmented.
other,6-1-P99-1036,ak
a
<term>
statistical model
</term>
of
<term>
Japanese unknown words
</term>
consisting of a set of
<term>
length
#30752We present a statistical model of Japanese unknown words consisting of a set of length and spelling models classified by the character types that constitute a word.
other,17-2-P99-1036,ak
differently and the changes between
<term>
character types
</term>
are very important because
<term>
Japanese
#30791The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
other,29-2-P99-1036,ak
has both
<term>
ideograms
</term>
like
<term>
Chinese
</term>
(
<term>
kanji
</term>
) and
<term>
phonograms
#30803The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
other,21-1-P99-1036,ak
spelling models
</term>
classified by the
<term>
character types
</term>
that constitute a
<term>
word
</term>
#30767We present a statistical model of Japanese unknown words consisting of a set of length and spelling models classified by the character types that constitute a word.
other,23-2-P99-1036,ak
types
</term>
are very important because
<term>
Japanese script
</term>
has both
<term>
ideograms
</term>
like
#30797The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
other,9-4-P99-1036,ak
96.6 %
<term>
tagging accuracy
</term>
if
<term>
unknown words
</term>
are correctly segmented . We propose
#30841The model can achieve 96.6% tagging accuracy if unknown words are correctly segmented.
model,14-1-P99-1036,ak
words
</term>
consisting of a set of
<term>
length and spelling models
</term>
classified by the
<term>
character
#30760We present a statistical model of Japanese unknown words consisting of a set of length and spelling models classified by the character types that constitute a word.
model,3-1-P99-1036,ak
Construct Algebra
</term>
. We present a
<term>
statistical model
</term>
of
<term>
Japanese unknown words
</term>
#30749We present a statistical model of Japanese unknown words consisting of a set of length and spelling models classified by the character types that constitute a word.
other,36-2-P99-1036,ak
</term>
) and
<term>
phonograms
</term>
like
<term>
English
</term>
(
<term>
katakana
</term>
) . Both
<term>
#30810The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
other,7-2-P99-1036,ak
point is quite simple : different
<term>
character sets
</term>
should be treated differently and
#30781The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).
measure(ment),6-4-P99-1036,ak
<term>
model
</term>
can achieve 96.6 %
<term>
tagging accuracy
</term>
if
<term>
unknown words
</term>
are correctly
#30838The model can achieve 96.6% tagging accuracy if unknown words are correctly segmented.
other,27-2-P99-1036,ak
<term>
Japanese script
</term>
has both
<term>
ideograms
</term>
like
<term>
Chinese
</term>
(
<term>
kanji
#30801The point is quite simple: different character sets should be treated differently and the changes between character types are very important because Japanese script has both ideograms like Chinese (kanji) and phonograms like English (katakana).