<term> Emotions </term> and other <term> indices </term> such as the <term> dominance distribution of speakers </term> might be available on the <term> surface </term> and could be used directly .
Despite the small size of the <term> databases </term> used , some results about the effectiveness of these <term> indices </term> can be obtained .
In this paper we show how two standard outputs from <term> information extraction ( IE ) systems </term> - <term> named entity annotations </term> and <term> scenario templates </term> - can be used to enhance access to <term> text collections </term> via a standard <term> text browser </term> .
We describe how this information is used in a <term> prototype system </term> designed to support <term> information workers </term> ' access to a <term> pharmaceutical news archive </term> as part of their <term> industry watch </term> function .
The theoretical study of the <term> range concatenation grammar [ RCG ] formalism </term> has revealed many attractive properties which may be used in <term> NLP </term> .
In our method , <term> unsupervised training </term> is first used to train a <term> phone n-gram model </term> for a particular <term> domain </term> ; the <term> output </term> of <term> recognition </term> with this <term> model </term> is then passed to a <term> phone-string classifier </term> .
First , a <term> decision list </term> is used to learn the <term> parsing-based NE rules </term> .
<term> Monolingual , unannotated text </term> can be used to further improve the <term> stemmer </term> by allowing it to adapt to a desired <term> domain </term> or <term> genre </term> .
We believe this is a state-of-the-art performance and the <term> algorithm </term> can be used for many <term> highly inflected languages </term> provided that one can create a small <term> manually segmented corpus </term> of the <term> language </term> of interest .
In this paper , we evaluate an approach to automatically acquire <term> sense-tagged training data </term> from <term> English-Chinese parallel corpora </term> , which are then used for disambiguating the <term> nouns </term> in the <term> SENSEVAL-2 English lexical sample task </term> .
We show that various <term> features </term> based on the structure of <term> email-threads </term> can be used to improve upon <term> lexical similarity </term> of <term> discourse segments </term> for <term> question-answer pairing </term> .
The same system , used in a <term> validation mode </term> , can be used to check and spot <term> alignment errors </term> in <term> multilingually aligned wordnets </term> such as <term> BalkaNet </term> and <term> EuroWordNet </term> .
Our results show that <term> MBR decoding </term> can be used to tune <term> statistical MT </term> performance for specific <term> loss functions </term> .
The <term> probabilistic model </term> used in the <term> alignment </term> directly models the <term> link decisions </term> .
Two <term> hardness </term> results for the class <term> NP </term> are reported , along with an <term> exponential time lower-bound </term> for certain classes of <term> algorithms </term> that are currently used in the literature .
We incorporate this analysis into a <term> diagnostic tool </term> intended for <term> developers </term> of <term> machine translation systems </term> , and demonstrate how our application can be used by <term> developers </term> to explore <term> patterns </term> in <term> machine translation output </term> .
Yet , they are scarcely used for the assessment of <term> language pairs </term> like <term> English-Chinese </term> or <term> English-Japanese </term> , because of the <term> word segmentation problem </term> .
<term> STTK </term> , a <term> statistical machine translation tool kit </term> , will be introduced and used to build a working <term> translation system </term> .
<term> STTK </term> has been developed by the presenter and co-workers over a number of years and is currently used as the basis of <term> CMU 's SMT system </term> .