ACL RD-TEC 1.0 Summarization of W96-0109
Paper Title:
EXPLOITING TEXT STRUCTURE FOR TOPIC IDENTIFICATION
Authors: Tadashi Nomoto and Yuji Matsumoto
Primarily assigned technology terms:
- anaphora resolution
- categorization
- classification
- computing
- cutoff
- emacs
- full-text retrieval
- identification
- information retrieval
- learning
- measuring
- normalization
- passage retrieval
- probabilistic thresholding
- retrieval system
- scoring
- statistical analysis
- terminology
- text categorization
- thresholding
- tile
- tokenizer
- topic identification
- weighting
Other assigned terms:
- adjunct
- anaphora
- approach
- bigram
- break
- case
- characters
- computational linguists
- concepts
- discourse
- discourse unit
- distribution
- document
- document frequency
- document similarity
- english translation
- fact
- formal structure
- french
- identification task
- implementation
- index
- inverse document frequency
- likelihood
- likelihood function
- likelihood value
- linguists
- measure
- method
- newspaper corpus
- norm
- normalization factor
- nouns
- nucleus
- paragraph
- paragraphs
- passage
- phrase
- precision
- priori
- probabilistic approach
- probabilistic model
- probabilities
- probability
- procedure
- process
- query
- relation
- relativization
- retrieval performance
- rhetorical structure
- rhetorical structure theory
- segments
- semantic
- sentence
- sentences
- similarity function
- similarity measure
- stylistic norm
- technique
- term
- term frequency
- terms
- test corpus
- test set
- text
- text length
- text segment
- text segments
- text structure
- textual units
- theorem
- theory
- tokens
- topic identification task
- topics
- training
- training corpus
- training set
- trigram
- user
- vocabulary
- word
- words