ACL RD-TEC 1.0 Summarization of C96-2110

Paper Title:
IDENTIFYING THE CODING SYSTEM AND LANGUAGE OF ON-LINE DOCUMENTS ON THE INTERNET

Other assigned terms:

  • ambiguity
  • approach
  • case
  • characters
  • class name
  • community
  • corpora
  • dictionaries
  • document
  • encoding scheme
  • fact
  • heuristic
  • heuristic rules
  • heuristics
  • information infrastructure
  • language models
  • likelihood
  • mapping
  • maps
  • message
  • method
  • names
  • probability
  • procedure
  • sentence
  • statistic
  • text
  • text corpora
  • tokens
  • unigram
  • unigram model
  • unigram probability
  • word
  • word boundaries
  • words

This page last edited on 10 May 2017.
