|
domain of
<term>
sentence condensation
</term>
.
|
Our
|
<term>
system
</term>
incorporates a
<term>
linguistic
|
#2808
We present an application of ambiguity packing and stochastic disambiguation techniques for Lexical-Functional Grammars (LFG) to the domain of sentence condensation. Our system incorporates a linguistic parser/generator for LFG, a transfer component for parse reduction operating on packed parse forests, and a maximum-entropy model for stochastic output selection. |
|
</term>
with
<term>
text understanding
</term>
.
|
Our
|
<term>
document understanding technology
</term>
|
#21388
Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding. Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |
|
English/Japanese language pairs
</term>
.
|
Our
|
study reveals that the proposed method
|
#5819
We evaluate the proposed methods through several transliteration/back transliteration experiments for English/Chinese and English/Japanese language pairs. Our study reveals that the proposed method not only reduces an extensive system development effort but also improves the transliteration accuracy significantly. |
|
</term>
and
<term>
Text Summarisation
</term>
.
|
Our
|
method takes advantage of the different
|
#6958
Topic signatures can be useful in a number of Natural Language Processing (NLP) applications, such as Word Sense Disambiguation (WSD) and Text Summarisation. Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
|
embedded within
<term>
disjunctions
</term>
.
|
Our
|
interpretation differs from that of Pereira
|
#14751
This semantics for feature structures extends the ideas of Pereira and Shieber [11], by providing an interpretation for values which are specified by disjunctions and path values embedded within disjunctions. Our interpretation differs from that of Pereira and Shieber by using a logical model in place of a denotational semantics. |
|
non-matches
</term>
in the
<term>
sentence
</term>
.
|
Our
|
results show that
<term>
MT evaluation techniques
|
#8397
We also introduce a novel classification method based on PER which leverages part of speech information of the words contributing to the word matches and non-matches in the sentence. Our results show that MT evaluation techniques are able to produce useful features for paraphrase classification and to a lesser extent entailment. |
|
a lesser extent
<term>
entailment
</term>
.
|
Our
|
<term>
technique
</term>
gives a substantial
|
#8420
Our results show that MT evaluation techniques are able to produce useful features for paraphrase classification and to a lesser extent entailment. Our technique gives a substantial improvement in paraphrase classification accuracy over all of the other models used in the experiments. |
|
</term>
of the same
<term>
source text
</term>
.
|
Our
|
approach yields
<term>
phrasal and single
|
#1800
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases. |
|
complex
<term>
linguistic databases
</term>
.
|
Our
|
most important task in building the
<term>
|
#17293
If we want valuable lexicons and grammars to achieve complex natural language processing, we must provide very powerful tools to help create and ensure the validity of such complex linguistic databases. Our most important task in building the editor was to define a set of coherence rules that could be computationally applied to ensure the validity of lexical entries. |
|
have in their
<term>
sense coverage
</term>
.
|
Our
|
analysis also highlights the importance
|
#4915
On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between the two approaches is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage. Our analysis also highlights the importance of the issue of domain dependence in evaluating WSD programs. |
|
outperform
<term>
word-based models
</term>
.
|
Our
|
empirical results , which hold for all
|
#2588
Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. |
|
</term>
,
<term>
missing periods
</term>
, etc .
|
Our
|
solution to these problems is to make use
|
#13026
However, a great deal of natural language texts, e.g., memos, rough drafts, conversation transcripts, etc., have features that differ significantly from neat texts, posing special problems for readers, such as misspelled words, missing words, poor syntactic construction, missing periods, etc. Our solution to these problems is to make use of expectations, based both on knowledge of surface English and on world knowledge of the situation being described. |
|
fundamental problems of
<term>
SMT
</term>
.
|
Our
|
work aims at providing useful insights
|
#9989
Over the last decade, a variety of SMT algorithms have been built and empirically tested whereas little is known about the computational complexity of some of the fundamental problems of SMT. Our work aims at providing useful insights into the computational complexity of those problems. |
|
<term>
Chomsky 's minimalist program
</term>
.
|
Our
|
<term>
logical definition
</term>
leads to
|
#1946
We provide a logical definition of Minimalist grammars, which are Stabler's formalization of Chomsky's minimalist program. Our logical definition leads to a neat relation to categorial grammar (yielding a treatment of Montague semantics), a parsing-as-deduction in a resource-sensitive logic, and a learning algorithm from structured data (based on a typing-algorithm and type-unification). |
|
the
<term>
error-correction rules
</term>
.
|
Our
|
<term>
algorithm
</term>
reported more than
|
#1277
The paper also proposes a rule-reduction algorithm applying mutual information to reduce the error-correction rules. Our algorithm reported more than 99% accuracy in both language identification and key prediction. |
|
break down ,
<term>
communication
</term>
.
|
Our
|
goal is to recognize and isolate such
<term>
|
#14498
Such mistakes can slow, and possibly break down, communication. Our goal is to recognize and isolate such miscommunications and circumvent them. |
|
it is often computationally inefficient .
|
Our
|
<term>
model
</term>
allows a careful examination
|
#14805
Unification is attractive, because of its generality, but it is often computationally inefficient. Our model allows a careful examination of the computational complexity of unification. |
|
themselves , e.g. block bigram features .
|
Our
|
<term>
training algorithm
</term>
can easily
|
#9626
We use a maximum likelihood criterion to train a log-linear block bigram model which uses real-valued features (e.g. a language model score) as well as binary features based on the block identities themselves, e.g. block bigram features. Our training algorithm can easily handle millions of features. |
|
to be
<term>
synonymous expressions
</term>
.
|
Our
|
proposed method improves the
<term>
accuracy
|
#6183
According to our assumption, most of the words with similar context features in each author's corpus tend not to be synonymous expressions. Our proposed method improves the accuracy of our term aggregation system, showing that our approach is successful. |
|
SENSEVAL-2 English lexical sample task
</term>
.
|
Our
|
investigation reveals that this
<term>
method
|
#4856
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. Our investigation reveals that this method of acquiring sense-tagged data is promising. |