ACL RD-TEC 1.0 Summarization of W06-2206

Paper Title:
SPOTTING THE 'ODD-ONE-OUT': DATA-DRIVEN ERROR DETECTION AND CORRECTION IN TEXTUAL DATABASES

Authors: Caroline Sporleder, Marieke van Erp, Tijn Porcelijn, and Antal van den Bosch

Other assigned terms:

  • abbreviations
  • annotation
  • annotator
  • approach
  • background knowledge
  • bibliographical information
  • case
  • classification problem
  • classification task
  • clusters
  • community
  • content words
  • data set
  • data sets
  • database record
  • development set
  • document
  • document frequency
  • dutch
  • fact
  • feature
  • feature set
  • feature vectors
  • french
  • function words
  • human annotator
  • information gain
  • inverse document frequency
  • knowledge
  • likelihood
  • manual annotation
  • method
  • names
  • noun phrases
  • parameter settings
  • person names
  • precision
  • prediction accuracy
  • prepositions
  • probability
  • probability distributions
  • process
  • proper names
  • proper noun
  • punctuation
  • query
  • rule set
  • similarity metric
  • stem
  • synonyms
  • taxonomy
  • term
  • term frequency
  • terms
  • test set
  • text
  • text classification task
  • textual information
  • theory
  • tokens
  • training
  • training data
  • training set
  • tree
  • uniform probability
  • user
  • word
  • word lists
  • words

This page last edited on 10 May 2017.
