C04-1081 used in a recent comprehensive Chinese word segmentation competition . State-of-the-art
C02-1148 based approach is the most popular Chinese word segmentation method . The idea is to use a
C00-1026 compound words is very similar to Chinese word segmentation process . It requires dictionary
C04-1067 Asahara et al. ( 2003 ) studied Chinese word segmentation based on a character tagging
C04-1067 processing . Xue ( 2003 ) studied Chinese word segmentation using the character tagging method
C02-1148 , we employ three families of Chinese word segmentation algorithms from the recent literature
C04-1081 are a viable model for robust Chinese word segmentation . 2 Conditional Random Fields
C04-1081 CRFs are a promising approach for Chinese word segmentation . New word detection is one of
A97-1018 obstacles block the progress of Chinese word segmentation : one is ambiguity , another
C04-1081 Unfortunately , building a Chinese word segmentation system is complicated by the
C02-1148 2 Word Segmentation Algorithms Chinese word segmentation has been extensively researched
C02-1145 for use as preprocessors . to Chinese Word Segmentation Using the data from CTB-I , we
C04-1081 or many numbers . A recent Chinese word segmentation competition ( Sproat and Emerson
C02-1148 decrease . The relationship between Chinese word segmentation accuracy and information retrieval
A97-1018 Proper Noun Bank r cha Abstract Chinese word segmentation and POS tagging are two key techniques
C04-1081 three-fold . First , we apply CRFs to Chinese word segmentation and find that they achieve state-of-the
C04-1081 baseline for future research on Chinese word segmentation . Acknowledgments This work was
C04-1081 of the datasets from a recent Chinese word segmentation bake-off competition ( Sproat
C92-4173 methods are not appropriate for Chinese word segmentation because almost every Chinese
C02-1145 machine learning approaches to Chinese word segmentation crucially hinge on the observation