ACL RD-TEC 1.0 Summarization of W06-1704

Paper Title:
CUCWeb: A Catalan Corpus Built from the Web

Authors: Gemma Boleda, Stefan Bott, Rodrigo Meza, Carlos Castillo, Toni Badia, and Vicente López

Other assigned terms:

  • acquisition task
  • ambiguous word
  • annotated corpus
  • annotation
  • approach
  • basque
  • cache
  • case
  • catalan
  • chunks
  • cluster
  • co-occurrence
  • co-occurrence frequency
  • community
  • computational linguists
  • constraint grammar
  • constraint grammar formalism
  • corpora
  • corpus exploitation
  • determiner
  • determiners
  • dictionary
  • distribution
  • document
  • dutch
  • encyclopedia
  • f-score
  • fact
  • feature
  • formalism
  • frame
  • french
  • genre
  • grammar
  • grammar formalism
  • grammars
  • heuristic
  • heuristics
  • implementation
  • index
  • information content
  • lemma
  • lemmata
  • lexical material
  • linguist
  • linguistic
  • linguistic data
  • linguistic filter
  • linguistics
  • linguists
  • main verb
  • manual tagging
  • markup
  • metadata
  • methodology
  • morphological features
  • morphological information
  • multilinguality
  • names
  • nlp community
  • noise
  • nouns
  • pagerank
  • part of speech
  • parts of speech
  • personal pronoun
  • procedure
  • process
  • pronoun
  • pronouns
  • punctuation
  • punctuation marks
  • queries
  • query
  • relative frequency
  • search results
  • seed
  • serbian
  • size of the corpus
  • statistical information
  • statistics
  • structural information
  • suffix
  • syntactic function
  • syntactic functions
  • syntactic information
  • tagset
  • teaching
  • technology
  • terms
  • text
  • text genre
  • translations
  • user
  • verb
  • verb form
  • web corpus
  • web documents
  • web pages
  • web site
  • word
  • word corpus
  • word form
  • word level
  • word strings
  • words

This page last edited on 10 May 2017.
