|
the
<term>
error-correction rules
</term>
.
|
Our
|
<term>
algorithm
</term>
reported more than
|
#1277
The paper also proposes a rule-reduction algorithm that applies mutual information to reduce the number of error-correction rules. Our algorithm reported more than 99% accuracy in both language identification and key prediction. |
|
have in their
<term>
sense coverage
</term>
.
|
Our
|
analysis also highlights the importance
|
#4915
On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between the two approaches is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage. Our analysis also highlights the importance of the issue of domain dependence in evaluating WSD programs. |
|
's text as a coherent
<term>
corpus
</term>
.
|
Our
|
approach is based on the idea that one
|
#6139
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. Our approach is based on the idea that one person tends to use one expression for one meaning. |
|
</term>
of the same
<term>
source text
</term>
.
|
Our
|
approach yields
<term>
phrasal and single
|
#1800
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases. |
|
of
<term>
unsupervised WSD systems
</term>
.
|
Our
|
<term>
combination methods
</term>
rely on
<term>
|
#11009
We investigate several voting- and arbiter-based combination strategies over a diverse pool of unsupervised WSD systems. Our combination methods rely on predominant senses which are derived automatically from raw text. |
|
</term>
with
<term>
text understanding
</term>
.
|
Our
|
<term>
document understanding technology
</term>
|
#21388
Because of the complexity of documents and the variety of applications which must be supported, document understanding requires the integration of image understanding with text understanding. Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |
|
outperform
<term>
word-based models
</term>
.
|
Our
|
empirical results , which hold for all
|
#2588
Within our framework, we carry out a large number of experiments to better understand and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. |
|
simple
<term>
information retrieval
</term>
.
|
Our
|
evaluation shows that our
<term>
filtering
|
#5480
We tested the clustering and filtering processes on electronic newsgroup discussions, and evaluated their performance by means of two experiments: coarse-level clustering and simple information retrieval. Our evaluation shows that our filtering mechanism has a significant positive effect on both tasks. |
|
break down ,
<term>
communication
</term>
.
|
Our
|
goal is to recognize and isolate such
<term>
|
#14498
Such mistakes can slow, and possibly break down, communication. Our goal is to recognize and isolate such miscommunications and circumvent them. |
|
embedded within
<term>
disjunctions
</term>
.
|
Our
|
interpretation differs from that of Pereira
|
#14751
This semantics for feature structures extends the ideas of Pereira and Shieber [11], by providing an interpretation for values which are specified by disjunctions and path values embedded within disjunctions. Our interpretation differs from that of Pereira and Shieber by using a logical model in place of a denotational semantics. |
|
SENSEVAL-2 English lexical sample task
</term>
.
|
Our
|
investigation reveals that this
<term>
method
|
#4856
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. Our investigation reveals that this method of acquiring sense-tagged data is promising. |
|
<term>
Chomsky 's minimalist program
</term>
.
|
Our
|
<term>
logical definition
</term>
leads to
|
#1946
We provide a logical definition of Minimalist grammars, which are Stabler's formalization of Chomsky's minimalist program. Our logical definition leads to a neat relation to categorial grammar (yielding a treatment of Montague semantics), a parsing-as-deduction in a resource-sensitive logic, and a learning algorithm from structured data (based on a typing-algorithm and type-unification). |
|
</term>
as an
<term>
edit operation
</term>
.
|
Our
|
<term>
measure
</term>
can be exactly calculated
|
#10375
In this paper, we will present a new evaluation measure which explicitly models block reordering as an edit operation. Our measure can be exactly calculated in quadratic time. |
|
occurrences of a
<term>
morpheme
</term>
) .
|
Our
|
method is seeded by a small
<term>
manually
|
#4638
We approximate Arabic's rich morphology by a model that a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme). Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. |
|
</term>
and
<term>
Text Summarisation
</term>
.
|
Our
|
method takes advantage of the different
|
#6958
Topic signatures can be useful in a number of Natural Language Processing (NLP) applications, such as Word Sense Disambiguation (WSD) and Text Summarisation. Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
|
it is often computationally inefficient .
|
Our
|
<term>
model
</term>
allows a careful examination
|
#14805
Unification is attractive because of its generality, but it is often computationally inefficient. Our model allows a careful examination of the computational complexity of unification. |
|
complex
<term>
linguistic databases
</term>
.
|
Our
|
most important task in building the
<term>
|
#17293
If we want valuable lexicons and grammars to achieve complex natural language processing, we must provide very powerful tools to help create and ensure the validity of such complex linguistic databases. Our most important task in building the editor was to define a set of coherence rules that could be computationally applied to ensure the validity of lexical entries. |
|
to be
<term>
synonymous expressions
</term>
.
|
Our
|
proposed method improves the
<term>
accuracy
|
#6183
According to our assumption, most of the words with similar context features in each author's corpus tend not to be synonymous expressions. Our proposed method improves the accuracy of our term aggregation system, showing that our approach is successful. |
|
</term>
that needs
<term>
affix removal
</term>
.
|
Our
|
<term>
resource-frugal approach
</term>
results
|
#4532
Examples and results will be given for Arabic, but the approach is applicable to any language that needs affix removal. Our resource-frugal approach results in 87.5% agreement with a state-of-the-art proprietary Arabic stemmer built using rules, affix lists, and human-annotated text, in addition to an unsupervised component. |
|
Chinese-to-English translation task
</term>
.
|
Our
|
results show that
<term>
MBR decoding
</term>
|
#6626
We report the performance of the MBR decoders on a Chinese-to-English translation task. Our results show that MBR decoding can be used to tune statistical MT performance for specific loss functions. |