ACL RD-TEC 1.0 Summarization of P05-1054
Paper Title:
A QUANTITATIVE ANALYSIS OF LEXICAL DIFFERENCES BETWEEN GENDERS IN TELEPHONE CONVERSATIONS
Authors: Constantinos Boulis and Mari Ostendorf
Primarily assigned technology terms:
- automatic speech recognition
- binary classification
- categorization
- classification
- classification method
- classifier
- classifiers
- computational linguistics
- data preparation
- feature selection
- feature selection mechanism
- feature weighting
- gender classification
- gender discrimination
- kneser-ney smoothing
- language model training
- language processing
- language production
- learning
- learning method
- learning methods
- learning techniques
- machine learning
- machine learning methods
- machine learning techniques
- machine translation
- maximum entropy
- measuring
- model training
- modeling
- multivariate analysis
- naive bayes
- natural language processing
- processing
- quantitative analysis
- ranking
- recognition
- selection mechanism
- smoothing
- speech recognition
- support vector machines
- text categorization
- text classification
- topic classification
- topic classifier
- vector representation
- voting
- weighting
Other assigned terms:
- american english
- approach
- association for computational linguistics
- bias
- bigram
- case
- characters
- classification accuracy
- classification performance
- confusion matrix
- convergence
- conversation
- conversational speech
- dialog
- dimensionality
- distribution
- document
- document frequency
- electrical engineering
- entropy
- experimental results
- fact
- feature
- hypothesis
- inferences
- information gain
- intention
- inverse document frequency
- knowledge
- language model
- language models
- language processing task
- language processing tasks
- large corpus
- linguistic
- linguistics
- measure
- method
- multinomial model
- names
- natural language
- natural language processing tasks
- pauses
- perplexity
- probability
- probability estimates
- processing tasks
- psycholinguistics
- punctuation
- punctuation marks
- sociolinguistics
- speech corpus
- standard deviation
- support vector
- svms
- term
- terms
- test set
- testing data
- text
- tf * idf
- theory
- tokens
- toolkit
- topics
- training
- training and testing data
- training data
- transcript
- transcriptions
- transcripts
- turntaking
- understanding
- vocabulary
- weighting scheme
- word
- word fragments
- words