ACL RD-TEC 1.0 Summarization of W06-1657
Paper Title:
SHORT TEXT AUTHORSHIP ATTRIBUTION VIA SEQUENCE KERNELS, MARKOV CHAINS AND AUTHOR UNMASKING: AN INVESTIGATION
Authors: Conrad Sanderson and Simon Guenter
Primarily assigned technology terms:
- author unmasking
- bag-of-words kernel
- cancer classification
- character sequence kernel
- classification
- classifier
- combined training
- computational linguistics
- cross-validation
- database
- editing
- feature elimination
- identification
- kernel
- kernels
- language modelling
- language processing
- linear interpolation
- matching
- maximum likelihood
- modelling
- natural language processing
- partial matching
- processing
- quadratic programming
- sequence kernel
- smoothing
- smoothing techniques
- splitting
- support vector machines
- svm discrimination
- unmasking procedure
- word stemming
- word-based kernel
Other assigned terms:
- approach
- association for computational linguistics
- authorship
- authorship attribution
- bias
- case
- character sequence
- characters
- chunk
- chunks
- dimensionality
- distribution
- error rate
- evaluations
- feature
- feature space
- generalisation
- hypothesis
- interpolation
- interpretation
- joint probability
- kernel function
- knowledge
- likelihood
- likelihood ratio
- linguistics
- maps
- markov chain
- method
- natural language
- opinion
- probabilities
- probability
- probability estimate
- procedure
- sentences
- sparse feature space
- standard deviation
- stems
- style
- support vector
- svms
- symbols
- technique
- term
- terms
- test material
- text
- topics
- training
- training dataset
- training material
- uniform distribution
- vocabulary
- vocabulary size
- word
- word sequence
- word sequences
- words