tech,5-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term>
measure(ment),25-1-H05-1117,ak , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers
other,32-1-H05-1117,ak automatically evaluating answers to <term> definition questions </term> . Until now , the only way to assess
tech,3-3-H05-1117,ak a system 's response . The lack of <term> automatic methods for scoring system output </term> is an impediment to progress in the
other,3-4-H05-1117,ak with this work . Experiments with the <term> TREC 2003 and TREC 2004 QA tracks </term> indicate that <term> rankings </term>
other,12-4-H05-1117,ak 2004 QA tracks </term> indicate that <term> rankings </term> produced by our metric correlate
other,21-4-H05-1117,ak metric correlate highly with official <term> rankings </term> , and that <term> POURPRE </term> outperforms
measure(ment),25-4-H05-1117,ak official <term> rankings </term> , and that <term> POURPRE </term> outperforms direct application of