tech,17-1-H92-1095,bq |
<term>
Language understanding
</term>
work at
<term>
Paramax
</term>
focuses on applying general-purpose
<term>
language understanding technology
</term>
to
<term>
spoken language understanding
</term>
,
<term>
text
understanding
</term>
, and
<term>
document processing
</term>
, integrating
<term>
language understanding
</term>
with
<term>
speech recognition
</term>
,
<term>
knowledge-based information retrieval
</term>
and
<term>
image understanding
</term>
.
|
#19654
Language understanding work at Paramax focuses on applying general-purpose language understanding technology to spoken language understanding, text understanding, and document processing, integrating language understanding with speech recognition, knowledge-based information retrieval and image understanding. |
tech,24-2-H94-1084,bq |
Our
<term>
document understanding technology
</term>
is implemented in a system called
<term>
IDUS ( Intelligent Document Understanding System )
</term>
, which creates the data for a
<term>
text
retrieval application
</term>
and the
<term>
automatic generation of hypertext links
</term>
.
|
#21412
Our document understanding technology is implemented in a system called IDUS (Intelligent Document Understanding System), which creates the data for a text retrieval application and the automatic generation of hypertext links. |
tech,3-1-C04-1116,bq |
We present a
<term>
text
mining method
</term>
for finding
<term>
synonymous expressions
</term>
based on the
<term>
distributional hypothesis
</term>
in a set of coherent
<term>
corpora
</term>
.
|
#6095
We present a text mining method for finding synonymous expressions based on the distributional hypothesis in a set of coherent corpora. |
tech,24-5-P04-2010,bq |
Although the system performs well within a limited textual domain , further research is needed to make it effective for
<term>
open-domain question answering
</term>
and
<term>
text
summarisation
</term>
.
|
#7122
Although the system performs well within a limited textual domain, further research is needed to make it effective for open-domain question answering and text summarisation. |
tech,21-3-H94-1084,bq |
This paper summarizes the areas of research during
<term>
IDUS
</term>
development where we have found the most benefit from the
<term>
integration
</term>
of
<term>
image and
text
understanding
</term>
.
|
#21446
This paper summarizes the areas of research during IDUS development where we have found the most benefit from the integration of image and text understanding. |
lr,26-6-P03-1050,bq |
Our
<term>
resource-frugal approach
</term>
results in 87.5 %
<term>
agreement
</term>
with a state of the art , proprietary
<term>
Arabic stemmer
</term>
built using
<term>
rules
</term>
,
<term>
affix lists
</term>
, and
<term>
human annotated
text
</term>
, in addition to an
<term>
unsupervised component
</term>
.
|
#4560
Our resource-frugal approach results in 87.5% agreement with a state of the art, proprietary Arabic stemmer built using rules, affix lists, and human annotated text, in addition to an unsupervised component. |
other,26-4-P04-2005,bq |
Our method takes advantage of the different way in which
<term>
word senses
</term>
are lexicalised in
<term>
English
</term>
and
<term>
Chinese
</term>
, and also exploits the large amount of
<term>
Chinese
text
</term>
available in
<term>
corpora
</term>
and on the
<term>
Web
</term>
.
|
#6985
Our method takes advantage of the different way in which word senses are lexicalised in English and Chinese, and also exploits the large amount of Chinese text available in corpora and on the Web. |
other,13-1-P84-1078,bq |
This report describes
<term>
Paul
</term>
, a
<term>
computer text generation system
</term>
designed to create
<term>
cohesive
text
</term>
through the use of
<term>
lexical substitutions
</term>
.
|
#13758
This report describes Paul, a computer text generation system designed to create cohesive text through the use of lexical substitutions. |
lr,20-3-I05-4010,bq |
The resultant
<term>
bilingual corpus
</term>
, 10.4 M
<term>
English words
</term>
and 18.3 M
<term>
Chinese characters
</term>
, is an authoritative and comprehensive
<term>
text
collection
</term>
covering the specific and special domain of HK laws .
|
#8272
The resultant bilingual corpus, 10.4M English words and 18.3M Chinese characters, is an authoritative and comprehensive text collection covering the specific and special domain of HK laws. |
tech,6-1-P84-1078,bq |
This report describes
<term>
Paul
</term>
, a
<term>
computer
text
generation system
</term>
designed to create
<term>
cohesive text
</term>
through the use of
<term>
lexical substitutions
</term>
.
|
#13751
This report describes Paul, a computer text generation system designed to create cohesive text through the use of lexical substitutions. |
other,24-1-N03-4010,bq |
The
<term>
JAVELIN system
</term>
integrates a flexible ,
<term>
planning-based architecture
</term>
with a variety of
<term>
language processing modules
</term>
to provide an
<term>
open-domain question answering capability
</term>
on
<term>
free
text
</term>
.
|
#3660
The JAVELIN system integrates a flexible, planning-based architecture with a variety of language processing modules to provide an open-domain question answering capability on free text. |
lr,19-2-N03-4010,bq |
The demonstration will focus on how
<term>
JAVELIN
</term>
processes
<term>
questions
</term>
and retrieves the most likely
<term>
answer candidates
</term>
from the given
<term>
text
corpus
</term>
.
|
#3681
The demonstration will focus on how JAVELIN processes questions and retrieves the most likely answer candidates from the given text corpus. |
tech,38-3-H01-1040,bq |
We also report results of a preliminary ,
<term>
qualitative user evaluation
</term>
of the
<term>
system
</term>
, which while broadly positive indicates further work needs to be done on the
<term>
interface
</term>
to make
<term>
users
</term>
aware of the increased potential of
<term>
IE-enhanced
text
browsers
</term>
.
|
#383
We also report results of a preliminary, qualitative user evaluation of the system, which while broadly positive indicates further work needs to be done on the interface to make users aware of the increased potential of IE-enhanced text browsers. |
tech,15-2-N06-4001,bq |
<term>
InfoMagnets
</term>
aims at making
<term>
exploratory corpus analysis
</term>
accessible to researchers who are not experts in
<term>
text
mining
</term>
.
|
#10891
InfoMagnets aims at making exploratory corpus analysis accessible to researchers who are not experts in text mining. |
other,10-2-A88-1001,bq |
<term>
Multimedia answers
</term>
include
<term>
videodisc images
</term>
and heuristically-produced complete
<term>
sentences
</term>
in
<term>
text
</term>
or
<term>
text-to-speech form
</term>
.
|
#14892
Multimedia answers include videodisc images and heuristically-produced complete sentences in text or text-to-speech form. |
other,13-1-P82-1035,bq |
Most large
<term>
text-understanding systems
</term>
have been designed under the assumption that the input
<term>
text
</term>
will be in reasonably neat form , e.g. ,
<term>
newspaper stories
</term>
and other
<term>
edited texts
</term>
.
|
#12957
Most large text-understanding systems have been designed under the assumption that the input text will be in reasonably neat form, e.g., newspaper stories and other edited texts. |
lr-prod,15-3-H94-1014,bq |
The models were constructed using a 5K
<term>
vocabulary
</term>
and trained using a 76 million
<term>
word
</term><term>
Wall Street Journal
text
corpus
</term>
.
|
#21260
The models were constructed using a 5K vocabulary and trained using a 76 million word Wall Street Journal text corpus. |
other,35-1-I05-4010,bq |
In this paper we present our recent work on harvesting
<term>
English-Chinese bitexts
</term>
of the laws of Hong Kong from the
<term>
Web
</term>
and aligning them to the
<term>
subparagraph
</term>
level via utilizing the
<term>
numbering system
</term>
in the
<term>
legal
text
hierarchy
</term>
.
|
#8239
In this paper we present our recent work on harvesting English-Chinese bitexts of the laws of Hong Kong from the Web and aligning them to the subparagraph level via utilizing the numbering system in the legal text hierarchy. |
tech,8-1-C90-3072,bq |
<term>
Spelling-checkers
</term>
have become an integral part of most
<term>
text
processing software
</term>
.
|
#16730
Spelling-checkers have become an integral part of most text processing software. |
other,11-7-H01-1042,bq |
Subjects were given a set of up to six extracts of
<term>
translated newswire
text
</term>
.
|
#695
Subjects were given a set of up to six extracts of translated newswire text. |