ACL RD-TEC 1.0 Summarization of I05-2039

Paper Title:
THE INFLUENCE OF DATA HOMOGENEITY ON NLP SYSTEM PERFORMANCE

Other assigned terms:

  • approach
  • arithmetic mean
  • array
  • bleu
  • case
  • characters
  • coefficient
  • community
  • contemporary english
  • conversation
  • corpora
  • correlation
  • data homogeneity
  • dialogues
  • distribution
  • edit distance
  • error rate
  • estimation
  • evaluation measures
  • fact
  • frequency counts
  • geometric mean
  • grammars
  • japanese language
  • japanese sentences
  • knowledge
  • language model
  • language model perplexity
  • language models
  • large corpora
  • large corpus
  • lexeme
  • lexemes
  • measure
  • measures
  • method
  • model complexity
  • model perplexity
  • mt quality
  • multilingual corpus
  • n-gram
  • n-gram model
  • nist
  • nlp community
  • objective translation
  • perplexity
  • probabilistic models
  • reference translation
  • reference translations
  • semantic
  • sentence
  • sentences
  • signal
  • similarity scores
  • speech database
  • standard deviation
  • style
  • sublanguage
  • system performance
  • target language
  • terms
  • test corpus
  • test set
  • text
  • training
  • training corpus
  • training data
  • transcriptions
  • transcripts
  • translation quality
  • translations
  • trees
  • user
  • word
  • word error rate
  • word frequency
  • words

Extracted Section Types:


This page last edited on 10 May 2017.
