Concordance

Query near-duplicate, detection 17
GDEX 17 (0.2 per million)

W15-4922	usually employed for tasks such as	near-duplicate detection	of websites , but can be applied
X98-1007	same or a similar manner . If the	near-duplicate detection	effort is successful , the resulting
X98-1017	Detection . The goal of our work in	near-duplicate detection	is to develop methods for delineating
X98-1007	focuses on high precision IR ,	near-duplicate detection	and context-dependent summarization
P13-1135	existing translation system and used	near-duplicate detection	methods to find candidate parallel
P13-1135	parallel Wikipedia documents by using	near-duplicate detection	, though they did not need to
K15-1013	questions . These tasks include	near-duplicate detection	, paraphrase identification and
D12-1032	papers on authorship attribution ,	near-duplicate detection	, deduplication , record linkage
K15-1013	addressed in this work . Duplicate and	Near-Duplicate Detection	aims to detect exact copies or
X98-1007	high-precision information retrieval ,	near-duplicate detection	, and summarization will be sufficiently
E06-2001	20GB of uncompressed data . 4	Near-duplicate detection	We use a simplified version of
P06-3008	, Word Sense Disambiguation ,	Near-duplicate detection	, bilingual alignment ( e.g.
W11-3603	we used Broder et al. ( 1997 )	near-duplicate detection	algorithm , and store only one
W15-3712	2006 ; Voß et al. , 2009 ) .	Near-duplicate detection	based on metadata is also well
W06-1639	applied the NLP technologies of	near-duplicate detection	and topic-based text categorization
X98-1007	technical paper [ 2 ] .	NEAR-DUPLICATE DETECTION	The goal of the research in this
X98-1017	documents given to the user . 2 .	Near-Duplicate Detection	. The goal of our work in near-duplicate

