ACL RD-TEC 1.0 Summarization of J00-4001
Paper Title:
AUTOMATIC TEXT CATEGORIZATION IN TERMS OF GENRE AND AUTHOR
Authors: Efstathios Stamatatos and George Kokkinakis and Nikos Fakotakis
Primarily assigned technology terms:
- author identification
- authorship identification
- automatic categorization
- automatic text categorization
- boundary detection
- categorization
- classification
- components analysis
- computational linguistics
- computational system
- deep understanding
- disambiguation
- disambiguation method
- disambiguation procedure
- factor analysis
- genre detection
- grouping
- identification
- information extraction
- information retrieval
- language processing
- multiple-pass parsing
- natural language processing
- nlp
- nlp systems
- nominalization
- parser
- parsers
- parsing
- phrase detection
- preprocessing
- processing
- processing tools
- sampling
- sentence boundary detection
- statistical methods
- statistical techniques
- stylistic analysis
- syntactic annotation
- taggers
- terminology
- text analysis
- text categorization
- text categorization system
- text genre detection
- text preprocessing
- text processing
- verb phrases
- world wide web
Other assigned terms:
- abbreviation
- ambiguity
- annotated corpus
- annotation
- approach
- authorship
- authorship attribution
- chunk
- chunks
- connectionist
- corpora
- distribution
- electronic form
- ellipsis
- feature
- function words
- genre
- hapax legomena
- idiomatic expressions
- input text
- keyword
- knowledge
- lexical items
- lexicon
- linguistic
- linguistic theories
- linguistics
- measure
- measures
- method
- methodology
- morphological ambiguity
- names
- natural language
- nlp applications
- noun phrases
- parse
- parse tree
- part-of-speech
- part-of-speech tags
- phrase
- prepositional phrases
- prepositions
- procedure
- proper names
- propositional content
- punctuation
- punctuation marks
- rewrite rules
- sentence
- sentence boundaries
- sentence boundary
- sentences
- set size
- style
- stylistic information
- suffix
- suffixes
- syllables
- syntactic categories
- syntactic information
- tags
- technique
- terms
- text
- text genre
- text length
- theories
- tokens
- training
- training set
- training set size
- tree
- trees
- understanding
- verb
- vocabulary
- word
- word count
- word frequencies
- words