ACL RD-TEC 1.0 Summarization of W03-1804
Paper Title:
USING MASKS, SUFFIX ARRAY-BASED DATA STRUCTURES AND MULTIDIMENSIONAL ARRAYS TO COMPUTE POSITIONAL NGRAM STATISTICS FROM CORPORA
Authors: Alexandre Gil and Gaël Dias
Primarily assigned technology terms:
Other assigned terms:
- approach
- array
- association measure
- case
- collocation
- concept
- context window
- corpora
- corpus size
- data structure
- data structures
- document
- extraction process
- fact
- generation
- graphical representation
- implementation
- large corpora
- lexical relations
- lexical unit
- linear time
- measure
- method
- ngram
- ngram model
- process
- relative frequency
- representations
- size of the corpus
- statistics
- substring
- suffix
- symbol
- term
- terms
- time complexity
- tokens
- vocabulary
- vocabulary size
- word
- word corpus
- words