ACL RD-TEC 1.0 Summarization of W04-1112
Paper Title:
CHINESE TERM EXTRACTION FROM WEB PAGES BASED ON COMPOUND TERM PRODUCTIVITY
Authors: Hiroshi Nakagawa and Hiroyuki Kojima and Akira Maeda
Primarily assigned technology terms:
- analyzer
- automatic term recognition
- c-value
- candidate extraction
- chinese language nlp
- chinese term extraction
- chinese web
- extraction method
- extraction procedure
- extraction system
- learning
- machine learning
- morphological analysis
- morphological analyzer
- morphological analyzers
- nlp
- parsing
- pos tagging
- ranking
- ranking method
- recognition
- recognition system
- regular expression
- scoring
- scoring function
- scoring method
- segmentation
- tagging
- term candidate extraction
- term extraction
- term recognition
- term recognition system
- terminology
- terminology extraction
- trigram acquisition
- tuning
- word segmentation
Other assigned terms:
- adjective
- agglutinative language
- alphabet
- ambiguity
- auxiliary verbs
- bias
- bigram
- boundary marker
- candidate terms
- case
- characters
- chinese characters
- chinese language
- chinese nouns
- chinese text
- chinese words
- complex term
- compound words
- concept
- concepts
- content words
- culture
- distribution
- document
- document frequency
- entropy
- explicit word boundary
- fact
- function words
- geometric mean
- gold standard
- inflection
- inverse document frequency
- key words
- knowledge
- language resources
- latin alphabet
- linguistic
- linguistic knowledge
- linguistic resources
- meaning
- meanings
- measure
- measures
- method
- names
- noun phrases
- nouns
- part of speech
- particle
- particles
- parts of speech
- pos sequence
- pos tag
- pos tag sequence
- precision
- prepositions
- procedure
- pronouns
- proper names
- punctuation
- punctuation marks
- relation
- segments
- sentence
- statistic
- statistics
- stop word list
- syntactic structure
- tag sequence
- tags
- target language
- term
- term frequency
- terms
- text
- text corpus
- trigram
- unigram
- verb
- vocabulary
- web page
- web pages
- web site
- word
- word boundary
- word sequence
- word sequences
- word trigram
- words