In this paper we show how two standard outputs from <term> information extraction ( IE ) systems </term> - <term> named entity annotations </term> and <term> scenario templates </term> - can be used to enhance access to <term> text collections </term> via a standard <term> text browser </term> .
The <term> CCLINC Korean-to-English translation system </term> consists of two <term> core modules </term> , <term> language understanding and generation modules </term> mediated by a <term> language neutral meaning representation </term> called a <term> semantic frame </term> .
We reconceptualize the task into two distinct phases .
Over two distinct <term> datasets </term> , we find that <term> indexing </term> according to simple <term> character bigrams </term> produces a <term> retrieval accuracy </term> superior to any of the tested <term> word N-gram models </term> .
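The character-bigram indexing mentioned above can be illustrated with a minimal inverted-index sketch; the document and query strings and the `build_index` / `search` helper names here are illustrative assumptions, not part of the cited system.

```python
from collections import defaultdict

def character_bigrams(text):
    """Split a string into overlapping character bigrams."""
    return [text[i:i + 2] for i in range(len(text) - 1)]

def build_index(docs):
    """Map each character bigram to the ids of the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for bg in character_bigrams(text):
            index[bg].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents sharing at least one bigram with the query."""
    hits = set()
    for bg in character_bigrams(query):
        hits |= index.get(bg, set())
    return hits
```

Because bigrams need no word segmentation, this style of indexing applies directly to languages without whitespace word boundaries, which is one reason it can rival word N-gram models in retrieval accuracy.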
In order to perform an exhaustive comparison , we also evaluate a <term> hand-crafted template-based generation component </term> , two <term> rule-based sentence planners </term> , and two <term> baseline sentence planners </term> .
The two <term> evaluation measures </term> of the <term> BLEU score </term> and the <term> NIST score </term> demonstrated the effect of using an out-of-domain <term> bilingual corpus </term> and the possibility of using the <term> language model </term> .
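As a rough illustration of the BLEU measure referenced above, the sketch below computes a single-reference BLEU: the geometric mean of clipped n-gram precisions times a brevity penalty. This is a simplified assumption-laden sketch (one reference, no smoothing), not the exact scoring code used in the evaluation.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision: candidate counts capped by reference counts."""
    cand = Counter(ngrams(candidate, n))
    ref = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref[g]) for g, c in cand.items())
    total = sum(cand.values())
    return clipped / total if total else 0.0

def bleu(candidate, reference, max_n=4):
    """Geometric mean of 1..max_n clipped precisions times a brevity penalty."""
    precisions = [modified_precision(candidate, reference, n)
                  for n in range(1, max_n + 1)]
    if min(precisions) == 0.0:
        return 0.0
    log_avg = sum(math.log(p) for p in precisions) / max_n
    # Brevity penalty punishes candidates shorter than the reference.
    bp = min(1.0, math.exp(1 - len(reference) / len(candidate)))
    return bp * math.exp(log_avg)
```

The NIST score differs mainly in weighting n-grams by their informativeness and in its brevity penalty, but rests on the same n-gram co-occurrence idea.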
We evaluate the utility of this <term> constraint </term> in two different <term> algorithms </term> .
The <term> bootstrapping procedure </term> is implemented as training two <term> successive learners </term> .
<term> FSM </term> provides two strategies for <term> language understanding </term> and has high accuracy but little robustness and flexibility .
On a subset of the most difficult <term> SENSEVAL-2 nouns </term> , the <term> accuracy </term> difference between the two approaches is only 14.0 % , and the difference could narrow further to 6.5 % if we disregard the advantage that <term> manually sense-tagged data </term> have in their <term> sense coverage </term> .
We then use the <term> predicates </term> of such <term> clauses </term> to create a set of <term> domain independent features </term> to annotate an input <term> dataset </term> , and run two different <term> machine learning algorithms </term> : <term> SLIPPER </term> , a <term> rule-based learning algorithm </term> , and <term> TiMBL </term> , a <term> memory-based system </term> .
We tested the <term> clustering and filtering processes </term> on <term> electronic newsgroup discussions </term> , and evaluated their performance by means of two experiments : coarse-level <term> clustering </term> and simple <term> information retrieval </term> .
In this paper , a novel framework for <term> machine transliteration/back transliteration </term> that allows us to carry out <term> direct orthographical mapping ( DOM ) </term> between two different <term> languages </term> is presented .
We give two estimates , a lower one and a higher one .
The correlation of the new <term> measure </term> with <term> human judgment </term> has been investigated systematically on two different <term> language pairs </term> .
We extend prior work in two ways .
Examination of the effect of <term> features </term> shows that <term> predicting top-level and predicting subtopic boundaries </term> are two distinct tasks : ( 1 ) for predicting <term> subtopic boundaries </term> , the <term> lexical cohesion-based approach </term> alone can achieve competitive results , ( 2 ) for <term> predicting top-level boundaries </term> , the <term> machine learning approach </term> that combines <term> lexical-cohesion and conversational features </term> performs best , and ( 3 ) <term> conversational cues </term> , such as <term> cue phrases </term> and <term> overlapping speech </term> , are better indicators for the top-level prediction task .
We also find that the <term> transcription errors </term> inevitable in <term> ASR output </term> have a negative impact on models that combine <term> lexical-cohesion and conversational features </term> , but do not change the general preference of approach for the two tasks .
This paper discusses two problems that arise in the <term> Generation of Referring Expressions </term> : ( a ) <term> numeric-valued attributes </term> , such as size or location ; ( b ) <term> perspective-taking in reference </term> .