ACL RD-TEC 1.0 Summarization of P01-1058
Paper Title:
EVALUATING CETEMPÚBLICO, A FREE RESOURCE FOR PORTUGUESE
Authors: Diana Santos and Paulo Rocha
Primarily assigned technology terms:
- algorithm
- author identification
- broadcasting
- classification
- coding
- computational processing
- corpus processing
- identification
- language engineering
- morphological analyser
- name identification
- nlp
- one-to-one mapping
- processing
- regular expression
- searching
- sentence separation
- spelling
- spelling checker
- subject classification
- tokenization
- validation
Other assigned terms:
- analyser
- annotation
- automatic correction
- bias
- british national corpus
- case
- characters
- checker
- chunk
- chunks
- community
- corpora
- culture
- distribution
- error rate
- evaluation data
- fact
- feature
- heuristic
- heuristic rules
- human inspection
- human intervention
- hypotheses
- intention
- large corpus
- linguists
- mapping
- markup
- meaning
- name length
- names
- newspaper corpus
- newspaper language
- nlp community
- norwegian
- nouns
- opinion
- paragraph
- paragraphs
- parallel corpus
- parse
- phrase
- portuguese language
- precision
- process
- proper name
- proper names
- punctuation
- punctuation mark
- queries
- questionnaire
- sentence
- sentence boundary
- sentences
- size of the corpus
- subcorpus
- tags
- test data
- text
- text chunk
- tokens
- treebank
- user
- web page
- web site
- word
- word corpus
- words