tech,5-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term> and <term> document summarization </term> , we present a similar approach , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers to <term> definition questions </term> .
tech,3-3-H05-1117,ak The lack of <term> automatic methods for scoring system output </term> is an impediment to progress in the field , which we address with this work .
other,32-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term> and <term> document summarization </term> , we present a similar approach , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers to <term> definition questions </term> .
tech,11-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term> and <term> document summarization </term> , we present a similar approach , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers to <term> definition questions </term> .
other,21-2-H05-1117,ak Until now , the only way to assess the correctness of answers to such questions involves manual determination of whether an <term> information nugget </term> appears in a system 's response .
tech,8-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term> and <term> document summarization </term> , we present a similar approach , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers to <term> definition questions </term> .
measure(ment),25-1-H05-1117,ak Following recent developments in the <term> automatic evaluation </term> of <term> machine translation </term> and <term> document summarization </term> , we present a similar approach , implemented in a measure called <term> POURPRE </term> , for automatically evaluating answers to <term> definition questions </term> .
measure(ment),25-4-H05-1117,ak Experiments with the <term> TREC 2003 and TREC 2004 QA tracks </term> indicate that <term> rankings </term> produced by our metric correlate highly with official <term> rankings </term> , and that <term> POURPRE </term> outperforms direct application of existing metrics .
other,12-4-H05-1117,ak Experiments with the <term> TREC 2003 and TREC 2004 QA tracks </term> indicate that <term> rankings </term> produced by our metric correlate highly with official <term> rankings </term> , and that <term> POURPRE </term> outperforms direct application of existing metrics .
other,21-4-H05-1117,ak Experiments with the <term> TREC 2003 and TREC 2004 QA tracks </term> indicate that <term> rankings </term> produced by our metric correlate highly with official <term> rankings </term> , and that <term> POURPRE </term> outperforms direct application of existing metrics .
other,3-4-H05-1117,ak Experiments with the <term> TREC 2003 and TREC 2004 QA tracks </term> indicate that <term> rankings </term> produced by our metric correlate highly with official <term> rankings </term> , and that <term> POURPRE </term> outperforms direct application of existing metrics .