ACL RD-TEC 1.0 Summarization of A94-1003
Paper Title:
LANGUAGE DETERMINATION: NATURAL LANGUAGE PROCESSING FROM SCANNED DOCUMENT IMAGES
Authors: Penelope Sibun and A. Lawrence Spitz
Primarily assigned technology terms:
- accurate comparison
- categorization
- character recognition
- classification
- classification trees
- coding
- content analysis
- content characterization
- cross validation
- cross-validation
- database
- discriminant analysis
- discriminate analysis
- document indexing
- encoding
- identification
- indexing
- information retrieval
- language determination
- language identification
- language processing
- language processing systems
- language understanding
- machine translation
- morphology
- natural language processing
- natural language processing systems
- natural language understanding
- neural networks
- nlp
- optical character recognition
- processing
- ranking
- recognition
- recognizer
- statistical methods
- statistical technique
- translators
- validation
Other assigned terms:
- accent
- alphabet
- approach
- case
- characters
- classification accuracy
- cluster
- computational complexity
- computational linguists
- confusion matrix
- corpora
- croatian
- culture
- determiners
- distribution
- document
- dutch
- english corpus
- english lexicon
- fact
- feature
- french
- german corpus
- german text
- lexicon
- linguists
- mapping
- mappings
- maps
- method
- methodology
- n-grams
- natural language
- nlp tasks
- norwegian
- patent
- procedure
- process
- pronouns
- representations
- roman alphabet
- sentences
- size of the corpus
- sources of information
- statistical model
- statistical models
- statistics
- style
- swahili
- syntax
- technique
- technology
- terms
- test data
- text
- tokens
- topics
- training
- training corpus
- training set
- transformation
- trees
- understanding
- vowel
- word
- words