ACL RD-TEC 1.0 Summarization of P05-2023
Paper Title:
AN UNSUPERVISED SYSTEM FOR IDENTIFYING ENGLISH INCLUSIONS IN GERMAN TEXT
Primarily assigned technology terms:
- algorithm
- anaphora resolution
- annotation tool
- automatic classification
- categorisation
- classification
- classifier
- computational linguistics
- cross-validation
- database
- databases
- disambiguation
- entity recognition
- identification
- internet
- language classification
- language identification
- learner
- learning
- learning techniques
- lexicon lookup
- machine learner
- machine learning
- machine learning techniques
- markov model
- morphological analysis
- n-gram-based text categorisation
- named entity recognition
- nlp
- post-processing
- postprocessing
- pre-processing
- random selection
- recognition
- search
- semantic analysis
- sense disambiguation
- statistical approaches
- synthesis
- tagger
- text categorisation
- text-to-speech
- text-to-speech synthesis
- tokenisation
- transducer
- weighting
- word sense disambiguation
- world wide web
- xml markup
Other assigned terms:
- 10-fold cross-validation
- abbreviations
- anaphora
- annotation
- case
- characters
- classification tasks
- compounds
- conditional markov model
- data set
- data sets
- dictionaries
- dutch
- english lexicon
- english web
- f-score
- fact
- feature
- feature set
- feature sets
- foreign words
- gazetteer
- german inflection
- german text
- gold standard
- grammar
- grammar rules
- grammars
- hypothesis
- inflection
- knowledge
- lemma
- lexical database
- lexical items
- lexical resources
- lexicon
- lexicon entries
- likelihood
- linguistic
- linguistic corpus
- linguistic knowledge
- linguistics
- markup
- method
- morphemes
- named entities
- named entity
- names
- natural language
- negra
- newspaper corpus
- nlp applications
- nlp tasks
- nouns
- parts of speech
- precision
- process
- pronunciation
- protein names
- queries
- semantic
- sentences
- synonym
- synthesis quality
- system description
- tags
- technology
- terms
- test data
- text
- tokens
- trained model
- training
- training and test data
- training data
- training data set
- web corpus
- web documents
- web pages
- word
- word sense
- words