Paper ID | Left context | Keyword | Right context
C02-2003 | give us about half the effect of | word bigrams | . Similarly, the per-query perplexity
D09-1096 | information about the vocabulary and | word bigram | features to capture short range
C04-1168 | pass of decoding, a multiclass | word bigram | of a lexicon of 37,000 words
C00-1047 | 1993) methods for extracting | word bigrams | have been widely used. Other
C02-1125 | been widely used for extracting | word bigrams | . Some measures for termhood
C04-1147 | behavior will be similar to a | word bigram | language model (with different
D12-1106 | implementation. Clusters are created using | word bigram | features after replacing numbers
C94-2198 | nonzero. After that, the full | word bigram | is stored in compressed form
D13-1156 | basic semantic unit. They used | word bigrams | as such language concepts. Their
C04-1067 | about known words (e.g., POS or | word bigram | probability) can be used. However
C00-1030 | character features in addition to | word bigrams | . Although it is still early
C96-2136 | powerful language models, such as | word bigram | , are required. (Jelinek,
D13-1156 | set of concepts in S, e.g., | word bigrams | (Gillick and Favre, 2009)
D09-1021 | we add state -- specifically, | word bigrams | at the start and end of constituents
A00-2035 | thus there are 250,000 potential | word bigrams | , but only a tiny fraction of
C04-1015 | ++ (Och and Ney, 2003) and | word bigram | and trigram models learned by
D13-1117 | unigrams: w-1, w0, w1 • | word bigram | : (w-1, w0) and (w0, w1
A00-2019 | ungrammatical tag and function | word bigrams | by computing the χ2 (chi square
D13-1125 | prior polarities -- e.g. using | word bigram | features (Wang and Manning,
D11-1106 | about the surface string, such as | word bigrams | ), although some features
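
The D13-1117 row above names the window convention explicitly: unigrams w-1, w0, w1 and word bigrams (w-1, w0) and (w0, w1). As a minimal illustration only, not taken from any of the cited papers, the Python sketch below extracts those features for one token position; the boundary padding symbols and the function name are assumptions.

    # Hypothetical sketch: unigram and word-bigram features around position i,
    # following the (w-1, w0), (w0, w1) convention quoted in the D13-1117 row.
    def window_features(tokens, i):
        pad = ["<s>"] + list(tokens) + ["</s>"]  # assumed boundary padding
        j = i + 1                                # shift index into the padded sequence
        w_prev, w0, w_next = pad[j - 1], pad[j], pad[j + 1]
        return {"unigrams": [w_prev, w0, w_next],
                "bigrams": [(w_prev, w0), (w0, w_next)]}

    # Example: window_features("the cat sat on the mat".split(), 2)
    # -> {'unigrams': ['cat', 'sat', 'on'], 'bigrams': [('cat', 'sat'), ('sat', 'on')]}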