D14-1092 evaluating the performance of Chinese word segmentation systems . Two of the four datasets are
C04-1067 matching method . Character Tagging A word segmentation system using the character tagging method
C04-1067 , 2003 ) . Maximum Matching A word segmentation system using the well-known maximum
C04-1067 other methods . Many practical word segmentation systems add candidates of unknown words
C96-1039 three single characters by our word segmentation system . From the viewpoint of personal
C96-1039 ) . It is just segmented by a word segmentation system without checking manually . Although
D13-1005 begin by evaluating our model as a word segmentation system . ( Table 1 gives segmentation
D11-1089 order to show how well existing word segmentation systems perform this task . Although
D08-1111 considered as the best Chinese word segmentation systems . We chose ICTCLAS as the comparison
C04-1081 fortunately , building a Chinese word segmentation system is complicated by the fact that
D14-1092 bound for any unsupervised Chinese word segmentation systems . We also use it as the topline
D14-1092 different types of unsupervised word segmentation systems . This paper is organized as
D11-1089 , even for the state-of-the-art word segmentation systems . On the other hand , PROPOSED
D11-1089 by , typically , using existent word segmentation systems . This is , however , not appropriate
D14-1092 influencing accuracy of Chinese word segmentation systems ( Huang and Zhao , 2007 ) . We
D14-1092 testing set T0 to test several word segmentation systems , there are N testing examples
C94-2209 -- 8 ] ) . Many automatic word segmentation systems adopting the above models have
H01-1057 researchers had implemented Thai word segmentation systems based on using a dictionary (
D14-1092 and comparison for unsupervised word segmentation systems , an important issue is what
C96-1039 corpus . It is segmented by a word segmentation system and is checked manually . In