|
the
<term>
error-correction rules
</term>
.
|
Our
|
<term>
algorithm
</term>
reported more than
|
#1277
The paper also proposes a rule-reduction algorithm applying mutual information to reduce the error-correction rules. Our algorithm reported more than 99% accuracy in both language identification and key prediction. |
|
</term>
of the same
<term>
source text
</term>
.
|
Our
|
approach yields
<term>
phrasal and single
|
#1800
We present an unsupervised learning algorithm for identification of paraphrases from a corpus of multiple English translations of the same source text. Our approach yields phrasal and single word lexical paraphrases as well as syntactic paraphrases. |
|
<term>
Chomsky 's minimalist program
</term>
.
|
Our
|
<term>
logical definition
</term>
leads to
|
#1946
We provide a logical definition of Minimalist grammars, which are Stabler's formalization of Chomsky's minimalist program. Our logical definition leads to a neat relation to categorial grammar, (yielding a treatment of Montague semantics), a parsing-as-deduction in a resource sensitive logic, and a learning algorithm from structured data (based on a typing-algorithm and type-unification). |
|
outperform
<term>
word-based models
</term>
.
|
Our
|
empirical results , which hold for all
|
#2588
Within our framework, we carry out a large number of experiments to understand better and explain why phrase-based models outperform word-based models. Our empirical results, which hold for all examined language pairs, suggest that the highest levels of performance can be obtained through relatively simple means: heuristic learning of phrase translations from word-based alignments and lexical weighting of phrase translations. |
|
domain of
<term>
sentence condensation
</term>
.
|
Our
|
<term>
system
</term>
incorporates a
<term>
linguistic
|
#2808
We present an application of ambiguity packing and stochastic disambiguation techniques for Lexical-Functional Grammars (LFG) to the domain of sentence condensation. Our system incorporates a linguistic parser/generator for LFG, a transfer component for parse reduction operating on packed parse forests, and a maximum-entropy model for stochastic output selection. |
|
resolution
</term>
in
<term>
spoken dialogue
</term>
.
|
Our
|
<term>
system
</term>
deals with
<term>
pronouns
|
#3987
We apply a decision tree based approach to pronoun resolution in spoken dialogue. Our system deals with pronouns with NP- and non-NP-antecedents. |
|
</term>
that needs
<term>
affix removal
</term>
.
|
Our
|
<term>
resource-frugal approach
</term>
results
|
#4532
Examples and results will be given for Arabic, but the approach is applicable to any language that needs affix removal. Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component. |
|
occurrences of a
<term>
morpheme
</term>
) .
|
Our
|
method is seeded by a small
<term>
manually
|
#4638
We approximate Arabic's rich morphology by a model that a word consists of a sequence of morphemes in the pattern prefix*-stem-suffix* (* denotes zero or more occurrences of a morpheme). Our method is seeded by a small manually segmented Arabic corpus and uses it to bootstrap an unsupervised algorithm to build the Arabic word segmenter from a large unsegmented Arabic corpus. |
|
SENSEVAL-2 English lexical sample task
</term>
.
|
Our
|
investigation reveals that this
<term>
method
|
#4856
In this paper, we evaluate an approach to automatically acquire sense-tagged training data from English-Chinese parallel corpora, which are then used for disambiguating the nouns in the SENSEVAL-2 English lexical sample task. Our investigation reveals that this method of acquiring sense-tagged data is promising. |
|
have in their
<term>
sense coverage
</term>
.
|
Our
|
analysis also highlights the importance
|
#4915
On a subset of the most difficult SENSEVAL-2 nouns, the accuracy difference between the two approaches is only 14.0%, and the difference could narrow further to 6.5% if we disregard the advantage that manually sense-tagged data have in their sense coverage. Our analysis also highlights the importance of the issue of domain dependence in evaluating WSD programs. |
|
simple
<term>
information retrieval
</term>
.
|
Our
|
evaluation shows that our
<term>
filtering
|
#5480
We tested the clustering and filtering processes on electronic newsgroup discussions, and evaluated their performance by means of two experiments: coarse-level clustering and simple information retrieval. Our evaluation shows that our filtering mechanism has a significant positive effect on both tasks. |
|
English/Japanese language pairs
</term>
.
|
Our
|
study reveals that the proposed method
|
#5819
We evaluate the proposed methods through several transliteration/back transliteration experiments for English/Chinese and English/Japanese language pairs. Our study reveals that the proposed method not only reduces an extensive system development effort but also improves the transliteration accuracy significantly. |
|
's text as a coherent
<term>
corpus
</term>
.
|
Our
|
approach is based on the idea that one
|
#6139
This paper proposes a new methodology to improve the accuracy of a term aggregation system using each author's text as a coherent corpus. Our approach is based on the idea that one person tends to use one expression for one meaning. |
|
to be
<term>
synonymous expressions
</term>
.
|
Our
|
proposed method improves the
<term>
accuracy
|
#6183
According to our assumption, most of the words with similar context features in each author's corpus tend not to be synonymous expressions. Our proposed method improves the accuracy of our term aggregation system, showing that our approach is successful. |
|
Chinese-to-English translation task
</term>
.
|
Our
|
results show that
<term>
MBR decoding
</term>
|
#6626
We report the performance of the MBR decoders on a Chinese-to-English translation task. Our results show that MBR decoding can be used to tune statistical MT performance for specific loss functions. |
|
</term>
and
<term>
Text Summarisation
</term>
.
|
Our
|
method takes advantage of the different
|
#6958
Topic signatures can be useful in a number of Natural Language Processing (NLP) applications, such as Word Sense Disambiguation (WSD) and Text Summarisation. Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
|
non-matches
</term>
in the
<term>
sentence
</term>
.
|
Our
|
results show that
<term>
MT evaluation techniques
|
#8397
We also introduce a novel classification method based on PER which leverages part of speech information of the words contributing to the word matches and non-matches in the sentence. Our results show that MT evaluation techniques are able to produce useful features for paraphrase classification and to a lesser extent entailment. |
|
a lesser extent
<term>
entailment
</term>
.
|
Our
|
<term>
technique
</term>
gives a substantial
|
#8420
Our results show that MT evaluation techniques are able to produce useful features for paraphrase classification and to a lesser extent entailment. Our technique gives a substantial improvement in paraphrase classification accuracy over all of the other models used in the experiments. |
|
themselves , e.g. block bigram features .
|
Our
|
<term>
training algorithm
</term>
can easily
|
#9626
We use a maximum likelihood criterion to train a log-linear block bigram model which uses real-valued features (e.g. a language model score) as well as binary features based on the block identities themselves, e.g. block bigram features. Our training algorithm can easily handle millions of features. |
|
fundamental problems of
<term>
SMT
</term>
.
|
Our
|
work aims at providing useful insights
|
#9989
Over the last decade, a variety of SMT algorithms have been built and empirically tested whereas little is known about the computational complexity of some of the fundamental problems of SMT. Our work aims at providing useful insights into the computational complexity of those problems. |