TOKENIZATION: Related Papers in ACL Anthology
- Concordance view (Keyword-In-Context) for the term tokenization in the ACL ARC 2.0;
- Concordance view in the ACL ARC 1.0 (Sketch Engine Service).
TOKENIZATION can be found in the following ACL ARC 1.0 documents:
- (ACL ID: A00-1032) language independent morphological analysis
- (ACL ID: A00-1033) a divide-and-conquer strategy for shallow parsing of german free texts
- (ACL ID: A00-2035) tagging sentence boundaries
- (ACL ID: A00-2040) a finite state and data-oriented method for grapheme to phoneme conversion
- (ACL ID: A92-1018) a practical part-of-speech tagger
- (ACL ID: A92-1047) lexical processing in the clare system
- (ACL ID: A94-1030) improving chinese tokenization with linguistic filters on statistical lexical acquisition
- (ACL ID: A97-1051) mixed-initiative development of language processing systems
- (ACL ID: C00-1020) a client/server architecture for word sense disambiguation
- (ACL ID: C00-2095) a formalism for universal segmentation of text
- (ACL ID: C00-2126) word order acquisition from corpora
- (ACL ID: C00-2144) thistle and interarbora
- (ACL ID: C00-2152) an integrated architecture for example-based machine translation
- (ACL ID: C00-2168) acquisition of a language computational model for nlp
- (ACL ID: C02-1002) a cheap and fast way to build useful translation lexicons
- (ACL ID: C02-1021) (semi-)automatic detection of errors in pos-tagged corpora
- (ACL ID: C02-1094) integrating linguistic and performance-based constraints for assigning phrase breaks
- (ACL ID: C02-2005) scaled log likelihood ratios for the detection of abbreviations in text corpora
- (ACL ID: C04-1018) playing the telephone game
- (ACL ID: C04-1021) modern natural language interfaces to databases
- (ACL ID: C04-1075) a high-performance coreference resolution system using a constraint-based multi-agent strategy
- (ACL ID: C04-1109) discriminative slot detection using kernel methods
- (ACL ID: C04-1156) knowledge intensive word alignment with knowa
- (ACL ID: C04-1163) a semantic-based approach to interoperability of classification hierarchies
- (ACL ID: C92-1033) ttp
- (ACL ID: C92-1063) the typology of unknown words
- (ACL ID: C92-4173) tokenization as the initial phase in nlp
- (ACL ID: C96-1020) beyond skeleton parsing
- (ACL ID: C96-2136) context-based spelling correction for japanese ocr
- (ACL ID: E06-1032) re-evaluating the role of bleu in machine translation research
- (ACL ID: E06-1047) parsing arabic dialects
- (ACL ID: E06-1051) exploiting shallow linguistic information for relation extraction from biomedical literature
- (ACL ID: E06-2024) a suite of shallow processing tools for portuguese
- (ACL ID: E95-1010) text alignment in the real world
- (ACL ID: E99-1001) named entity recognition without gazetteers
- (ACL ID: E99-1013) complementing wordnet with roget's and corpus-based thesauri for information retrieval
- (ACL ID: E99-1026) japanese dependency structure analysis based on maximum entropy models
- (ACL ID: E99-1045) encoding a parallel corpus for automatic terminology extraction
- (ACL ID: H01-1038) integrated feasibility experiment for bio-security
- (ACL ID: H91-1068) fast text processing for information retrieval
- (ACL ID: H93-1037) lingstat
- (ACL ID: H94-1029) the automatic component of the lingstat machine-aided translation system
- (ACL ID: H94-1050) weighted rational transductions and their application to human language processing
- (ACL ID: H94-1096) the automatic component of the lingstat machine-aided translation system
- (ACL ID: I05-2038) syntax annotation for the genia corpus
- (ACL ID: I05-5003) using machine translation evaluation techniques to determine sentence-level semantic equivalence
- (ACL ID: J01-2004) probabilistic top-down parsing and language modeling
- (ACL ID: J01-4004) a machine learning approach to coreference resolution of noun phrases
- (ACL ID: J02-2002) the combinatory morphemic lexicon
- (ACL ID: J03-3002) the web as a parallel corpus
- (ACL ID: J03-4001) dependency parsing with an extended finite-state approach
- (ACL ID: J05-3002) sentence fusion for multidocument news summarization
- (ACL ID: J92-1002) an estimate of an upper bound for the entropy of english
- (ACL ID: J94-3004) the reconstruction engine
- (ACL ID: J96-3004) a stochastic finite-state word-segmentation algorithm for chinese
- (ACL ID: J97-2002) adaptive multilingual sentence boundary disambiguation
- (ACL ID: J97-3001) a rule-based hyphenator for modern greek
- (ACL ID: J97-3003) automatic rule induction for unknown-word guessing
- (ACL ID: J97-4004) critical tokenization and its properties
- (ACL ID: M91-1023) gte
- (ACL ID: M91-1031) synchronetics
- (ACL ID: M92-1030) mitre-bedford
- (ACL ID: M93-1013) mitre-bedford
- (ACL ID: M93-1021) unisys
- (ACL ID: M95-1008) knight-ridder information's value adding name finder
- (ACL ID: M95-1009) lockheed martin
- (ACL ID: M95-1012) mitre
- (ACL ID: M95-1014) the nyu system for muc-6 or where's the syntax?
- (ACL ID: M95-1015) university of pennsylvania
- (ACL ID: M95-1016) description of the saic dx system as used for muc-6
- (ACL ID: M98-1016) description of the kent ridge digital labs system used for muc-7
- (ACL ID: M98-1018) nyu
- (ACL ID: M98-1028) appendix e
- (ACL ID: M98-1029) appendix f
- (ACL ID: N01-1025) chunking with support vector machines
- (ACL ID: N01-1029) a structured language model based on context-sensitive probabilistic left-corner parsing
- (ACL ID: N03-1018) a generative probabilistic ocr model for nlp applications
- (ACL ID: N04-1008) automatic question answering
- (ACL ID: N04-1013) speed and accuracy in shallow and deep stochastic parsing
- (ACL ID: N04-2002) identifying chemical names in biomedical text
- (ACL ID: N04-2007) a preliminary look into the use of named entity information for bioscience text tokenization
- (ACL ID: N04-3008) senseclusters - finding clusters that represent word senses
- (ACL ID: N04-4026) a unigram orientation model for statistical machine translation
- (ACL ID: N04-4038) automatic tagging of arabic text
- (ACL ID: N06-1055) semantic role labeling of nominalized predicates in chinese
- (ACL ID: N06-2013) arabic preprocessing schemes for statistical machine translation
- (ACL ID: N06-2035) weblog classification for fast splog filtering
- (ACL ID: N06-2038) a comparison of tagging strategies for statistical information extraction
- (ACL ID: N06-4005) aqualog
- (ACL ID: P01-1002) invited talk
- (ACL ID: P01-1041) japanese named entity recognition based on a simple rule generator and decision tree learning
- (ACL ID: P01-1058) evaluating cetempúblico, a free resource for portuguese
- (ACL ID: P01-1069) text chunking using regularized winnow
- (ACL ID: P02-1056) an integrated architecture for shallow and deep processing
- (ACL ID: P03-1011) loosely tree-based alignment for machine translation
- (ACL ID: P03-1015) combining deep and shallow approaches in parsing german
- (ACL ID: P03-1023) coreference resolution using competition learning approach
- (ACL ID: P03-1037) parametric models of linguistic count data
- (ACL ID: P03-2019) integrating information extraction and automatic hyperlinking
- (ACL ID: P04-1004) analysis of mixed natural and symbolic input in mathematical dialogs
- (ACL ID: P04-1021) a joint source-channel model for machine transliteration
- (ACL ID: P04-1025) extracting regulatory gene expression networks from pubmed
- (ACL ID: P04-1030) head-driven parsing for word lattices
- (ACL ID: P04-1057) error mining for wide-coverage grammar engineering
- (ACL ID: P04-3014) improving bitext word alignments via syntax-based reordering of english
- (ACL ID: P05-1046) unsupervised learning of field segmentation models for information extraction
- (ACL ID: P05-1052) extracting relations with integrated information using kernel methods
- (ACL ID: P05-1064) a phonotactic language model for spoken language identification
- (ACL ID: P05-1069) a localized prediction model for statistical machine translation
- (ACL ID: P05-1071) arabic tokenization, part-of-speech tagging and morphological disambiguation in one fell swoop
- (ACL ID: P05-1073) joint learning improves semantic role labeling
- (ACL ID: P05-1075) a nonparametric method for extraction of candidate phrasal terms
- (ACL ID: P05-3009) the linguist’s search engine
- (ACL ID: P06-1001) combination of arabic preprocessing schemes for statistical machine translation
- (ACL ID: P06-1015) espresso
- (ACL ID: P06-1016) modeling commonality among related classes in relation extraction
- (ACL ID: P06-1042) error mining in parsing results
- (ACL ID: P06-1078) incorporating speech recognition confidence into discriminative named entity recognition of speech data
- (ACL ID: P06-1090) a clustered global phrase reordering model for statistical machine translation
- (ACL ID: P06-1091) a discriminative global training algorithm for statistical mt
- (ACL ID: P06-1108) event extraction in a plot advice agent
- (ACL ID: P06-2005) a phrase-based statistical model for sms text normalization
- (ACL ID: P06-2006) evaluating the accuracy of an unlexicalized statistical parser on the parc depbank
- (ACL ID: P06-2024) towards a modular data model for multi-layer annotated corpora
- (ACL ID: P06-2058) obfuscating document stylometry to preserve author anonymity
- (ACL ID: P06-2105) a logic-based semantic approach to recognizing textual entailment
- (ACL ID: P06-2114) sinhala grapheme-to-phoneme conversion and rules for schwa epenthesis
- (ACL ID: P06-3008) discursive usage of six chinese punctuation marks
- (ACL ID: P06-4018) nltk
- (ACL ID: P89-1012) dictionaries, dictionary grammars and dictionary entry parsing
- (ACL ID: P92-1021) lattice-based word identification in clare
- (ACL ID: P94-1002) multi-paragraph segmentation of expository text
- (ACL ID: P96-1015) directed replacement
- (ACL ID: P97-1039) a portable algorithm for mapping bitext correspondence
- (ACL ID: P97-1063) a word-to-word model of translational equivalence
- (ACL ID: P98-1030) terminology finite-state preprocessing for computational lfg
- (ACL ID: P98-1076) one tokenization per source
- (ACL ID: P98-2189) ranking text units according to textual saliency, connectivity and topic aptness
- (ACL ID: P99-1046) statistical models for topic segmentation
- (ACL ID: W00-0409) multi-document summarization by visualizing topical content
- (ACL ID: W00-0504) mandarin-english information (mei)
- (ACL ID: W00-0506) pre-processing closed captions for machine translation
- (ACL ID: W00-0602) some challenges of developing fully-automated systems for taking audio comprehension exams
- (ACL ID: W00-1103) use of dependency tree structures for the microcontext extraction
- (ACL ID: W00-1104) semantic indexing using wordnet senses
- (ACL ID: W00-1205) sinica treebank
- (ACL ID: W00-1217) how should a large corpus be built?-a comparative study of closure in annotated newspaper corpora from two chinese sources, towards building a larger representative corpus merged from representative sublanguage collections
- (ACL ID: W00-1302) what's yours and what's mine
- (ACL ID: W00-1320) a statistical model for parsing and word-sense disambiguation
- (ACL ID: W01-1011) gist-it
- (ACL ID: W01-1312) a multilingual approach to annotating and extracting temporal information
- (ACL ID: W01-1409) building a statistical machine translation system from scratch
- (ACL ID: W01-1412) a comparative study on translation units for bilingual lexicon extraction
- (ACL ID: W02-0606) unsupervised discovery of morphologically related words based on orthographic and semantic similarity
- (ACL ID: W02-0814) evaluating the results of a memory-based word-expert approach to unrestricted word sense disambiguation
- (ACL ID: W02-0903) boosting automatic lexical acquisition with morphological information
- (ACL ID: W02-1013) from words to corpora
- (ACL ID: W02-1031) the superarv language model
- (ACL ID: W02-1302) towards a road map on human language technology
- (ACL ID: W02-1506) adapting existing grammars
- (ACL ID: W02-1817) automatic recognition of chinese unknown words based on roles tagging
- (ACL ID: W03-0308) treq-al
- (ACL ID: W03-0504) summarization of noisy documents
- (ACL ID: W03-0801) the talent system
- (ACL ID: W03-0806) blueprint for a high performance nlp infrastructure
- (ACL ID: W03-0810) accelerating corporate research in the development, application, and deployment of human language technologies
- (ACL ID: W03-0909) surfaces and depths in text understanding
- (ACL ID: W03-1211) question answering on a case insensitive corpus
- (ACL ID: W03-1301) gene name extraction using flybase resources
- (ACL ID: W03-1309) protein name tagging for biomedical annotation in text
- (ACL ID: W03-1315) an investigation of various information sources for classifying biological names
- (ACL ID: W03-1606) normalization and paraphrasing using symbolic methods
- (ACL ID: W03-1701) unsupervised training for overlapping ambiguity resolution in chinese word segmentation
- (ACL ID: W03-1731) chunking-based chinese word tokenization
- (ACL ID: W03-2008) natural language analysis of patent claims
- (ACL ID: W04-0407) representation and treatment of multiword expressions in basque
- (ACL ID: W04-0409) integrating morphology with multi-word expression processing in turkish
- (ACL ID: W04-0804) senseval-3 task
- (ACL ID: W04-0810) a first evaluation of logic form identification systems
- (ACL ID: W04-0821) the university of maryland senseval-3 system descriptions
- (ACL ID: W04-0850) the duluth lexical sample systems in senseval-3
- (ACL ID: W04-1111) a statistical model for hangeul-hanja conversion in terminology domain
- (ACL ID: W04-1221) biomedical named entity recognition using conditional random fields and rich feature sets
- (ACL ID: W04-1601) computer processing of arabic script-based languages. current state and future directions
- (ACL ID: W04-1602) developing an arabic treebank
- (ACL ID: W04-1606) issues in arabic orthography and morphology analysis
- (ACL ID: W04-1609) an unsupervised approach for bootstrapping arabic sense tagging
- (ACL ID: W04-1703) nlp-based scripting for call activities
- (ACL ID: W04-1806) automatically inducing ontologies from corpora
- (ACL ID: W04-2602) towards full automation of lexicon construction
- (ACL ID: W04-2710) annotating wordnet
- (ACL ID: W04-3101) a resource for constructing customized test suites for molecular biology entity identification systems
- (ACL ID: W04-3111) integrated annotation for biomedical information extraction
- (ACL ID: W04-3217) automatic analysis of plot for story rewriting
- (ACL ID: W04-3238) spelling correction as an iterative process that exploits the collective knowledge of web users
- (ACL ID: W04-3245) from machine translation to computer assisted translation using finite-state models
- (ACL ID: W05-0101) teaching applied natural language processing
- (ACL ID: W05-0110) teaching language technology at the north-west university
- (ACL ID: W05-0111) hands-on nlp for an interdisciplinary audience
- (ACL ID: W05-0304) parallel entity and treebank annotation
- (ACL ID: W05-0603) search engine statistics beyond the n-gram
- (ACL ID: W05-0705) modifying a natural language processing system for european languages to treat arabic in information processing and information retrieval applications
- (ACL ID: W05-0706) choosing an optimal architecture for segmentation and pos-tagging of modern hebrew
- (ACL ID: W05-0708) pos tagging of dialectal arabic
- (ACL ID: W05-0804) bilingual word spectral clustering for statistical machine translation
- (ACL ID: W05-0822) portage
- (ACL ID: W05-0826) combining linguistic data views for phrase-based smt
- (ACL ID: W05-0903) preprocessing and normalization for automatic evaluation of machine translation
- (ACL ID: W05-1306) corpus design for biomedical natural language processing
- (ACL ID: W06-0115) the third international chinese language processing bakeoff
- (ACL ID: W06-0117) france telecom r&d beijing word segmenter for sighan bakeoff 2006
- (ACL ID: W06-1008) a fast and accurate method for detecting english-japanese parallel texts
- (ACL ID: W06-1104) automatically creating datasets for measures of semantic relatedness
- (ACL ID: W06-1317) classification of discourse coherence relations
- (ACL ID: W06-1501) the hidden tag model
- (ACL ID: W06-1650) automatically assessing review helpfulness
- (ACL ID: W06-1658) entity annotation based on inverse index operations
- (ACL ID: W06-1708) the problem of ontology alignment on the web
- (ACL ID: W06-1905) keyword translation accuracy and cross-lingual question answering in chinese and japanese
- (ACL ID: W06-1910) experiments adapting an open-domain question answering system to the geographical domain using scope-based resources
- (ACL ID: W06-2205) recognition of synonyms by a lexical graph
- (ACL ID: W06-2701) representing and querying multi-dimensional markup for question answering
- (ACL ID: W06-2702) annotation and disambiguation of semantic types in biomedical text
- (ACL ID: W06-2713) representing and accessing multilevel linguistic annotation using the meaning format
- (ACL ID: W06-2716) layering and merging linguistic annotations
- (ACL ID: W06-2914) word distributions for thematic segmentation in a support vector machine approach
- (ACL ID: W06-2920) conll-x shared task on multilingual dependency parsing
- (ACL ID: W06-3108) discriminative reordering models for statistical machine translation
- (ACL ID: W06-3114) manual and automatic evaluation of machine translation between european languages
- (ACL ID: W06-3115) ntt system description for the wmt2006 shared task
- (ACL ID: W06-3118) portage
- (ACL ID: W06-3303) term generalization and synonym resolution for biological abstracts
- (ACL ID: W06-3306) human gene name normalization using text matching with automatically extracted synonym dictionaries
- (ACL ID: W06-3312) postnominal prepositional phrase attachment in proteomics
- (ACL ID: W06-3328) bootstrapping and evaluating named entity recognition in the biomedical domain
- (ACL ID: W06-3711) ibm mastor system
- (ACL ID: W06-3812) chinese whispers - an efficient graph clustering algorithm and its application to natural language processing problems
- (ACL ID: W93-0109) the automatic acquisition of frequencies of verb subcategorization frames from tagged corpora
- (ACL ID: W95-0114) compiling bilingual lexicon entries from a non-parallel english-chinese corpus
- (ACL ID: W96-0101) using word class for part-of-speech disambiguation
- (ACL ID: W97-0102) commercial implementation of text recognition tools for vlc
- (ACL ID: W97-0312) learning to tag multilingual texts through observation
- (ACL ID: W97-0502) automatic message indexing and full text retrieval for a communication aid
- (ACL ID: W97-0909) nlp and industry
- (ACL ID: W97-1008) what makes a word
- (ACL ID: W97-1015) a comparative study of the application of different learning techniques to natural language interfaces
- (ACL ID: W97-1508) lexical resource reconciliation in the xerox linguistic environment
- (ACL ID: W98-0203) coreference as the foundations for link analysis over free text databases
- (ACL ID: W98-0211) how to build a (quite general) linguistic diagram editor
- (ACL ID: W98-1002) tagarab
- (ACL ID: W98-1102) encoding linguistic corpora
- (ACL ID: W98-1118) exploiting diverse knowledge sources via maximum entropy in named entity recognition
- (ACL ID: W99-0313) a two-level approach to coding dialogue for discourse structure
- (ACL ID: W99-0605) cross-language information retrieval for technical documents
- (ACL ID: W99-0612) language independent named entity recognition combining morphological and contextual evidence
- (ACL ID: W99-0634) corpus-based learning for noun phrase coreference resolution
- (ACL ID: X96-1043) tipster text phase ii architecture design version 2.1p 19 june 1996
- (ACL ID: X98-1004) the common pattern specification language
- (ACL ID: X98-1010) coreference resolution strategies from an application perspective
- (ACL ID: X98-1012) research in information extraction
- (ACL ID: X98-1013) information extraction research and applications
- (ACL ID: X98-1021) overview of the university of pennsylvania's tipster project
- (ACL ID: X98-1023) improving robust domain independent summarization
- (ACL ID: X98-1026) automated text summarization and the summarist system
* See also the list of terms related to tokenization.