A00-2035 |
not is absolutely crucial for
|
sentence splitting
|
. Unfortunately , abbreviations
|
A00-2035 |
the local context of potential
|
sentence splitting
|
punctuation . However , what
|
D11-1038 |
and apposition . An example of a
|
sentence splitting
|
rule is illustrated in Figure
|
A00-2035 |
which follows the period or other
|
sentence splitting
|
punctuation . In general , when
|
D11-1038 |
commonest simplification operations is
|
sentence splitting
|
which usually produces longer
|
C04-1017 |
system is equipped with another
|
sentence splitting
|
method based on parsing trees
|
D11-1038 |
and rules ( 4 ) -- ( 6 ) involve
|
sentence splitting
|
. Examples of common lexical
|
C04-1017 |
well . In order to supplement
|
sentence splitting
|
based on word-sequence characteristics
|
A00-1012 |
another problem sometimes called "
|
sentence splitting
|
" . This problem aims to identify
|
D15-1063 |
linguistic preprocessing steps are
|
sentence splitting
|
and tokenization . Thus , we
|
A00-2035 |
Aberdeen et al. , 1995 ) contains a
|
sentence splitting
|
module which employs over 100
|
C02-1027 |
filtered by special constraints . The
|
sentence splitting
|
and tokenising rules were adapted
|
C04-1017 |
method , we generate candidates for
|
sentence splitting
|
based on N-grams , and select
|
C02-1027 |
pre-processing modules for tokenisation ,
|
sentence splitting
|
, paragraph segmentation , part-of-speech
|
A00-2035 |
improvement in the performance on
|
sentence splitting
|
and about a 40 % improvement
|
C02-1027 |
includes modules for POS tagging ,
|
sentence splitting
|
, clause segmentation , parsing
|
A00-2035 |
reducing the ambiguity for the
|
sentence splitting
|
module . The second row of Table
|
A00-2035 |
achieved a 0.65 % error rate on
|
sentence splitting
|
on the Brown Corpus and 1.39
|
D15-1157 |
errors that result from noisy
|
sentence splitting
|
and tokenisation that must be
|
D15-1148 |
separately model the need for
|
sentence splitting
|
( Zhu et al. , 2010 ; Woodsend
|