D11-1100 as features , we first perform stop-word removal . The words are not stemmed since
W05-0704 characters , removal of diacritics and stop-word removal have also been explored [
P12-2043 characters , removal of diacritics and stop-word removal have also been explored ( Xu
N03-1025 pendent . Various techniques such as stop-word removal or stemming require language
E14-4002 weighting scheme for unigrams without stop-words removal ( Uni . + SW ) . Then , we proposed
P11-2105 weighting . No word stemming or stop-word removal was performed . This dataset
P13-3007 each document , stemming and stop-word removal processes are adopted . Furthermore
D14-1149 among terms in the results , but stop-word removal , sentence segmentation and TF-IDF
D11-1040 preprocessing besides stemming and stop-word removal . We extract text snippets representing
S14-2011 language model . No lemmatization , stop-word removal and lowercase transformation
W02-0507 This step involves tokenization , stop-word removal , root extraction , and term
S14-2013 preprocessing techniques such as stop-word removal and TF-IDF . TF-IDF is a standard
W01-0507 twice are used . Both stemming and stop-word removal are performed . For computation
S13-2062 implemented in three steps namely , stop-word removal , stemming , and removal of words
S14-2050 system , we first performed simple stop-words removal with the NLTK toolkit ( Bird
D10-1024 special pre-processing such as stop-word removal and frequency cutoffs . Also
D13-1103 common word , after stemming and stop-word removal . For each passage , a set of
P13-3020 and apply the stemming and the stop-word removal processes to the documents .
S12-1094 method have considered stemming , stop-word removal , part-of-speech tagging , longest
D08-1114 this case we did not use either stop-word removal or stemming as this has been