Traditional <term> information retrieval techniques </term> use a <term> histogram </term> of <term> keywords </term> as the <term> document representation </term> , but <term> oral communication </term> may offer additional <term> indices </term> such as the time and place of the rejoinder and the attendance .
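As a rough illustration of the keyword-histogram representation mentioned above, the sketch below builds a term-count vector for a single document; the tokenizer and stopword list are simplified assumptions, not part of the original work.

```python
from collections import Counter
import re

def keyword_histogram(text, stopwords=frozenset({"the", "a", "of", "and", "to"})):
    """Build a simple keyword histogram (bag-of-words term counts) for one document.

    The tokenizer and stopword list here are illustrative assumptions only.
    """
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)

# Invented example document.
doc = "The meeting was a planning meeting about the budget."
print(keyword_histogram(doc))
# Counter({'meeting': 2, 'was': 1, 'planning': 1, 'about': 1, 'budget': 1})
```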
An alternative <term> index </term> could be the activity such as discussing , planning , informing , story-telling , etc .
<term> Emotions </term> and other <term> indices </term> such as the <term> dominance distribution of speakers </term> might be available on the <term> surface </term> and could be used directly .
This paper presents a <term> formal analysis </term> for a large class of <term> words </term> called <term> alternative markers </term> , which includes <term> other ( than ) </term> , <term> such ( as ) </term> , and <term> besides </term> .
We then use the <term> predicates </term> of such <term> clauses </term> to create a set of <term> domain independent features </term> to annotate an input <term> dataset </term> , and run two different <term> machine learning algorithms </term> : <term> SLIPPER </term> , a <term> rule-based learning algorithm </term> , and <term> TiMBL </term> , a <term> memory-based system </term> .
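A minimal sketch of the kind of pipeline described above. SLIPPER and TiMBL themselves are not used here: a decision tree and a k-nearest-neighbour classifier from scikit-learn stand in for the rule-based and memory-based learners, and the clause predicates, features, and labels are invented examples.

```python
# Sketch only: scikit-learn stand-ins replace SLIPPER (rule-based) and TiMBL
# (memory-based); the predicate features and labels below are invented.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier      # stand-in for a rule learner
from sklearn.neighbors import KNeighborsClassifier   # stand-in for a memory-based learner

# Each instance is annotated with domain-independent features derived from
# the predicates of its clauses.
instances = [
    {"predicate": "say",   "has_negation": False},
    {"predicate": "claim", "has_negation": True},
    {"predicate": "say",   "has_negation": True},
    {"predicate": "deny",  "has_negation": False},
]
labels = ["neutral", "subjective", "neutral", "subjective"]

vec = DictVectorizer(sparse=False)
X = vec.fit_transform(instances)

for clf in (DecisionTreeClassifier(), KNeighborsClassifier(n_neighbors=1)):
    clf.fit(X, labels)
    test = vec.transform([{"predicate": "deny", "has_negation": True}])
    print(type(clf).__name__, clf.predict(test))
```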
However , such an approach does not work well when there is no distinctive <term> attribute </term> among <term> objects </term> .
We conducted <term> psychological experiments </term> with 42 subjects to collect <term> referring expressions </term> in such situations , and built a <term> generation algorithm </term> based on the results .
<term> Information extraction techniques </term> automatically create <term> structured databases </term> from <term> unstructured data sources </term> , such as the <term> Web </term> or <term> newswire documents </term> .
<term> Topic signatures </term> can be useful in a number of <term> Natural Language Processing ( NLP ) </term> applications , such as <term> Word Sense Disambiguation ( WSD ) </term> and <term> Text Summarisation </term> .
We demonstrate how errors in the <term> machine translations </term> of the input <term> Arabic documents </term> can be corrected by identifying and generating from such <term> redundancy </term> , focusing on <term> noun phrases </term> .
A <term> method </term> for producing such <term> phrases </term> from <term> word-aligned corpora </term> is proposed .
A <term> statistical translation model </term> is also presented that deals with such <term> phrases </term> , as well as a <term> training method </term> based on the maximization of <term> translation accuracy </term> , as measured with the <term> NIST evaluation metric </term> .
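To make the phrase-extraction idea concrete, the sketch below shows one standard heuristic for pulling phrase pairs out of a word-aligned sentence pair: keep every pair whose alignment links do not cross its boundaries. The sentence pair and alignment are invented, and the training method and NIST-based accuracy maximization mentioned above are not reproduced here.

```python
def extract_phrase_pairs(src, tgt, alignment, max_len=3):
    """Extract phrase pairs consistent with a word alignment.

    `alignment` is a set of (src_index, tgt_index) links; a phrase pair is kept
    when no alignment link crosses its boundaries (the usual consistency check).
    Unaligned-word expansion is omitted to keep the sketch short.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(len(src), i1 + max_len)):
            # Target positions linked to the source span [i1, i2].
            tps = [t for (s, t) in alignment if i1 <= s <= i2]
            if not tps:
                continue
            j1, j2 = min(tps), max(tps)
            # Consistency: no target word in [j1, j2] may align outside [i1, i2].
            if any(j1 <= t <= j2 and not (i1 <= s <= i2) for (s, t) in alignment):
                continue
            if j2 - j1 < max_len:
                pairs.append((" ".join(src[i1:i2 + 1]), " ".join(tgt[j1:j2 + 1])))
    return pairs

# Invented example: a tiny word-aligned sentence pair.
src = ["das", "haus", "ist", "klein"]
tgt = ["the", "house", "is", "small"]
alignment = {(0, 0), (1, 1), (2, 2), (3, 3)}
print(extract_phrase_pairs(src, tgt, alignment))
```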
Until now , the only way to assess the correctness of answers to such questions has involved manual determination of whether an information nugget appears in a system 's response .
Automatic <term> evaluation metrics </term> for <term> Machine Translation ( MT ) systems </term> , such as <term> BLEU </term> or <term> NIST </term> , are now well established .
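For concreteness, the hedged sketch below scores a single hypothesis with NLTK's sentence-level BLEU; the hypothesis and reference are invented, and real MT evaluations (including NIST) are normally run at corpus level with dedicated tooling.

```python
# Illustrative only: sentence-level BLEU via NLTK on an invented example.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = [["the", "cat", "sat", "on", "the", "mat"]]   # list of tokenized references
hypothesis = ["the", "cat", "is", "on", "the", "mat"]

score = sentence_bleu(reference, hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU = {score:.3f}")
```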
We first introduce our <term> approach </term> to inducing such a <term> grammar </term> from <term> parallel corpora </term> .
In many cases , though , such movements still result in correct or almost correct <term> sentences </term> .
Examination of the effect of <term> features </term> shows that <term> predicting top-level and predicting subtopic boundaries </term> are two distinct tasks : ( 1 ) for predicting <term> subtopic boundaries </term> , the <term> lexical cohesion-based approach </term> alone can achieve competitive results , ( 2 ) for <term> predicting top-level boundaries </term> , the <term> machine learning approach </term> that combines <term> lexical-cohesion and conversational features </term> performs best , and ( 3 ) <term> conversational cues </term> , such as <term> cue phrases </term> and <term> overlapping speech </term> , are better indicators for the top-level prediction task .
This paper discusses two problems that arise in the <term> Generation of Referring Expressions </term> : ( a ) <term> numeric-valued attributes </term> , such as size or location ; ( b ) <term> perspective-taking in reference </term> .
Finding the preferred <term> language </term> for such a <term> need </term> is a valuable task .
This formalism is both elementary and powerful enough to strongly simulate many <term> grammar formalisms </term> , such as <term> rewriting systems </term> , <term> dependency grammars </term> , <term> TAG </term> , <term> HPSG </term> and <term> LFG </term> .