ACL RD-TEC 1.0 Summarization of E99-1019
Paper Title:
EXPLORING THE USE OF LINGUISTIC FEATURES IN DOMAIN AND GENRE CLASSIFICATION
Authors: Maria Wolters and Mathias Kirsten
Primarily assigned technology terms:
- algorithm
- binary classification
- categorisation
- categorization
- classification
- classification algorithm
- classification trees
- classifier
- classifiers
- comparative analysis
- computing
- content analysis
- corresponding training
- cross validation
- encoding
- feature selection
- feature weighting
- genre analysis
- genre classification
- genre detection
- information retrieval
- k-nn
- k-nn classification
- learning
- learning algorithms
- nearest-neighbour classifier
- parsers
- partitioning
- recursive partitioning
- search
- searching
- splitting
- statistical method
- text categorisation
- text classification
- validation
- weighting
Other assigned terms:
- approach
- array
- authorship
- authorship attribution
- auxiliary verbs
- benchmark
- brown corpus
- case
- categorisation research
- category label
- characters
- classification accuracy
- classification task
- classification tasks
- codebook
- community
- composition
- content words
- convergence
- corpora
- data set
- data sets
- determiners
- dimensionality
- discourse
- distance measure
- distance metric
- distribution
- document
- events
- fact
- feature
- feature set
- feature sets
- feature type
- feature value
- feature vector
- feature vectors
- fixed word order
- function word
- function word lemmata
- function words
- genre
- hypotheses
- index
- information gain
- leaf
- lemma
- lemmata
- linguistic
- linguistic features
- linguistic information
- linguistics
- logic
- mathematics
- meaning
- measure
- method
- negation
- noise
- noun phrases
- nouns
- opinion
- parameter settings
- part-of-speech
- particle
- passive voice
- personal pronouns
- pos information
- pos tag
- positive and negative examples
- precision
- procedure
- pronoun
- pronouns
- punctuation
- punctuation information
- punctuation marks
- relative frequency
- representations
- representative corpora
- semantic
- semantic classes
- semantic features
- sentence
- sentences
- sparse data
- standard deviation
- stems
- stop word list
- style
- syntactic relations
- tags
- tagset
- technology
- term
- terms
- test set
- text
- text type
- theory
- training
- training corpus
- training data
- training material
- training set
- training text
- tree
- tree node
- trees
- type theory
- unannotated text
- word
- word classes
- word features
- word frequencies
- word information
- word lemmata
- word lists
- word order
- word vector
- words