ACL RD-TEC 1.0 Summarization of W05-0908

Paper Title:
On Some Pitfalls in Automatic Evaluation and Significance Testing for MT

Authors: Stefan Riezler and John T. Maxwell

Other assigned terms:

  • approximate randomization
  • argumentation
  • association for computational linguistics
  • baseline model
  • benchmark
  • bigram
  • bleu
  • bleu score
  • case
  • coefficient
  • dependency relations
  • development set
  • distribution
  • error rate
  • estimation
  • evaluation measure
  • evaluation measures
  • evaluation metric
  • evaluation metrics
  • evaluation task
  • evaluations
  • extrinsic evaluation measures
  • f-score
  • fact
  • feature
  • feature sets
  • grammatical relations
  • hypothesis
  • hypothesis test
  • inferences
  • knowledge
  • language model
  • lexical choice
  • likelihood
  • linguistics
  • log-likelihood
  • log-linear model
  • meaning
  • measure
  • measures
  • method
  • mt evaluation
  • n-gram
  • n-grams
  • natural language
  • nist
  • null hypothesis
  • optimization criterion
  • order variation
  • parallel corpus
  • parameter values
  • parse
  • phrase
  • phrase-based system
  • precision
  • probability
  • procedure
  • reference translation
  • reference translations
  • relation
  • semantic
  • sentence
  • sentences
  • similarity measures
  • statistic
  • statistical significance
  • statistics
  • structural information
  • system development
  • technique
  • technologies
  • term
  • test corpus
  • test data
  • test set
  • textbook
  • training
  • training and test data
  • training data
  • training set
  • translation quality
  • translational adequacy
  • translations
  • trigram
  • word
  • word order
  • word order variation
  • words

This page last edited on 10 May 2017.
