H92-1041 of proportional assignment with word-based indexing languages . Figure 3 shows results
P01-1004 character-based indexing over word-based indexing is that there is no pre-processing
C00-1006 with character overlap . With word-based indexing , this would only be possible
H92-1041 The optimal feature set size for word-based indexing was found to be surprisingly
P01-1004 indexing performs comparably to word-based indexing . In analogous research , Baldwin
C00-1006 partitioned off into character-based and word-based indexing for the various similarity
P01-1004 indexing is consistently superior to word-based indexing . Furthermore , the bag-of-words
C00-1006 produces a superior match accuracy to word-based indexing for all similarity metrics ,
C00-1006 2 . O3 for character-based and word-based indexing , respectively . All methods
C00-1006 not stem or lemmatise words in word-based indexing . Having said this , the . output
C00-1006 ious similarity methods . For word-based indexing , segmentation was carried out
C00-1006 sequential correspondence for word-based indexing , but the word order-based methods
C00-1006 indexing performs comparably with word-based indexing in Japanese information retrieval
C00-1006 conservatively for character-based than word-based indexing . The most robust method is (
C00-1006 methods for both character- and word-based indexing , peaking at just over 50 % for
P01-1004 ( 2000 ) compared character- and word-based indexing within a Japanese -- English
C00-1006 " , but would not match under word-based indexing . Character-based indexing
C00-1006 number of string comparisons in word-based indexing evaluation for VSM , token in
P01-1004 corpus , under both character - and word-based indexing , and with each of unigrams ,
C00-1006 performance for both character-based and word-based indexing . As such , this side-effect