#3244We present a syntax-based constraint for word alignment, known as the cohesion constraint. It requires disjoint English phrases to be mapped to non-overlapping intervals in the French sentence.
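As an illustration of the constraint stated in #3244, the following is a minimal sketch, not the authors' implementation. It assumes an alignment is given as a set of (English index, French index) pairs and that the disjoint English phrases are supplied as lists of word indices; the names `projected_span` and `is_cohesive` are hypothetical.

```python
from itertools import combinations


def projected_span(phrase, alignment):
    """Return the (min, max) French positions linked to the English indices
    in `phrase`, or None if no word of the phrase is aligned."""
    targets = [f for e, f in alignment if e in phrase]
    return (min(targets), max(targets)) if targets else None


def is_cohesive(phrases, alignment):
    """Check that disjoint English phrases project to non-overlapping
    French intervals. `phrases` is assumed to hold disjoint index lists."""
    spans = [projected_span(set(p), alignment) for p in phrases]
    spans = [s for s in spans if s is not None]  # unaligned phrases impose nothing
    for (a_lo, a_hi), (b_lo, b_hi) in combinations(spans, 2):
        # Two closed intervals overlap when each starts before the other ends.
        if a_lo <= b_hi and b_lo <= a_hi:
            return False
    return True


# Hypothetical toy data: two English phrases of two words each.
phrases = [[0, 1], [2, 3]]
monotone = {(0, 0), (1, 1), (2, 2), (3, 3)}
crossing = {(0, 0), (1, 2), (2, 1), (3, 3)}
print(is_cohesive(phrases, monotone))
print(is_cohesive(phrases, crossing))
```

The first alignment keeps the two phrases in separate French intervals; the second interleaves them, so the check fails.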
inflow of multilingual , multimedia data .
It
gives users the ability to spend their
#3605The TAP-XL Automated Analyst's Assistant is an application designed to help an English-speaking analyst write a topical report, culling information from a large inflow of multilingual, multimedia data. It gives users the ability to spend their time finding more data relevant to their task, and gives them translingual reach into other languages by leveraging human language technology.
central to our
<term>
IE paradigm
</term>
.
It
is based on : ( 1 ) an extended set of
<term>
#3751We also introduce a new way of automatically identifying predicate argument structures, which is central to our IE paradigm. It is based on: (1) an extended set of features; and (2) inductive decision tree learning.
</term>
from
<term>
Japanese news texts
</term>
.
It
is found that the
<term>
Bayesian approach
#5405Comparison is made against non-Bayesian summarizers, using test data from Japanese news texts. It is found that the Bayesian approach generally leverages performance of a summarizer, at times giving it a significant lead over non-Bayesian models.
basis of
<term>
CMU 's SMT system
</term>
.
It
has also successfully been coupled with
#6918STTK has been developed by the presenter and co-workers over a number of years and is currently used as the basis of CMU's SMT system. It has also successfully been coupled with rule-based and example-based machine translation modules to build a multi-engine machine translation system.
and
<term>
special domain
</term>
of HK laws .
It
is particularly valuable to
<term>
empirical
#7334The resultant bilingual corpus, 10.4M English words and 18.3M Chinese characters, is an authoritative and comprehensive text collection covering the specific and special domain of HK laws. It is particularly valuable to empirical MT research.
intended message of an information graphic .
It
then presents an implemented
<term>
graphic
#8693This paper presents a corpus study that explores the extent to which captions contribute to recognizing the intended message of an information graphic. It then presents an implemented graphic interpretation system that takes into account a variety of communicative signals, and an evaluation study showing that evidence obtained from shallow processing of the graphic's caption has a significant impact on the system's success.
a
<term>
robust statistical parser
</term>
.
It
uses a powerful
<term>
pattern-matching language
#10320The system incorporates a decision-tree classifier for 30 SCF types which tests for the presence of grammatical relations (GRs) in the output of a robust statistical parser. It uses a powerful pattern-matching language to classify GRs into frames hierarchically in a way that mirrors inheritance-based lexica.
materials for
<term>
vocabulary learning
</term>
.
It
enables us to select a concise set of reading
#10768We propose a method of organizing reading materials for vocabulary learning. It enables us to select a concise set of reading texts (from a target corpus) that contains all the target vocabulary to be learned.
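The selection problem described in #10768 resembles set cover. The following is a minimal sketch using a greedy heuristic, which is an assumption here and not necessarily the method of the cited work. Each text is assumed to be represented by the set of words it contains, and the name `select_texts` is hypothetical.

```python
def select_texts(texts, target_vocab):
    """Greedily pick texts until the target vocabulary is covered.

    `texts` maps a text name to the set of words it contains.
    Returns the chosen names and any target words no text covers.
    """
    remaining = set(target_vocab)
    chosen = []
    while remaining:
        # Pick the text that covers the most still-uncovered target words.
        best = max(texts, key=lambda name: len(texts[name] & remaining))
        gained = texts[best] & remaining
        if not gained:
            break  # no text covers any of the remaining words
        chosen.append(best)
        remaining -= gained
    return chosen, remaining


# Hypothetical toy corpus.
texts = {
    "a": {"cat", "dog"},
    "b": {"dog", "fish", "bird"},
    "c": {"cat"},
}
chosen, uncovered = select_texts(texts, {"cat", "dog", "fish", "bird"})
print(chosen, uncovered)
```

The greedy choice does not guarantee the smallest possible set of texts, but it is a common approximation for covering problems of this kind.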
induction ( WSI )
</term>
is introduced .
It
represents an instantiation of the
<term>
#11043In this paper a novel solution to automatic and unsupervised word sense induction (WSI) is introduced. It represents an instantiation of the one sense per collocation observation (Gale et al., 1992).
of 96 % and a
<term>
recall
</term>
of 98 % .
It
also gets a
<term>
precision
</term>
of 70
#12198After several experiments, and trained with a small corpus of 100,000 words, the system correctly predicts where no comma should be placed with a precision of 96% and a recall of 98%. It also gets a precision of 70% and a recall of 49% in the task of placing commas.
features
</term>
from the
<term>
contexts
</term>
.
It
works by calculating
<term>
eigenvectors
</term>
#12279This paper presents an unsupervised learning approach to disambiguate various relations between named entities by use of various lexical and syntactic features from the contexts. It works by calculating eigenvectors of an adjacency graph's Laplacian to recover a submanifold of data from a high dimensionality space and then performing cluster number estimation on the eigenvectors.
<term>
implicit intention component
</term>
.
It
is argued that the method reduces
<term>
#13435Each generalized metaphor contains a recognition network, a basic mapping, additional transfer mappings, and an implicit intention component. It is argued that the method reduces metaphor interpretation from a reconstruction to a recognition task.
</term>
for
<term>
context-free grammars
</term>
.
It
is argued that the resulting
<term>
algorithm
#14095This paper proposes a series of modifications to the left corner parsing algorithm for context-free grammars. It is argued that the resulting algorithm is both efficient and flexible and is, therefore, a good choice for the parser used in a natural language interface.
( =
<term>
aspectual information
</term>
) .
It
will be demonstrated in this paper that
#17217The verb forms are often claimed to convey two kinds of information: (1) whether the event described in a sentence is present, past or future (= deictic information); (2) whether the event described in a sentence is presented as completed, going on, just starting or being finished (= aspectual information). It will be demonstrated in this paper that one has to add a third component to the analysis of verb form meanings, namely whether or not they express habituality.
modeling
</term>
in such
<term>
systems
</term>
.
It
begins with a characterization of what
#18896This paper explores the role of user modeling in such systems. It begins with a characterization of what a user model is and how it can be used.
implemented for a fragment at the IMS .
It
is based on the
<term>
theory of tenses
</term>
#19086A proposal to deal with French tenses in the framework of Discourse Representation Theory is presented, as it has been implemented for a fragment at the IMS. It is based on the theory of tenses of H. Kamp and Ch. Rohrer.
TAGs
</term>
has been developed .
It
can be adapted to take advantage of the
#19731An Earley-type parser for TAGs has been developed. It can be adapted to take advantage of the two-step parsing strategy.
a simple
<term>
correctness proof
</term>
.
It
is presented as a
<term>
generalization
</term>
#21471A purely functional implementation of LR-parsers is given, together with a simple correctness proof. It is presented as a generalization of the recursive descent parser.
<term>
parametrized deduction process
</term>
.
It
will be shown that this view supports flexible
#21643The main feature of this model is to view parsing and generation as two strongly interleaved tasks performed by a single parametrized deduction process. It will be shown that this view supports flexible and efficient natural language processing.