ACL RD-TEC 1.0 Summarization of P06-1011
Paper Title:
EXTRACTING PARALLEL SUB-SENTENTIAL FRAGMENTS FROM NON-PARALLEL CORPORA
Authors: Dragos Stefan Munteanu and Daniel Marcu
Primarily assigned technology terms:
- algorithm
- alignment template model
- bootstrap
- bootstrap resampling
- candidate selection
- chunking
- computational linguistics
- computing
- data acquisition
- detection method
- dynamic programming
- english machine translation
- extraction method
- extraction system
- fragment detection
- fragment extraction
- giza
- information retrieval
- linking
- machine translation
- machine translation system
- machine translation training
- measuring
- mining
- mt system
- mt systems
- parallel sentence detection
- parallel training
- probabilistic translation
- resampling
- search
- sentence alignment
- sentence detection
- sentence detection method
- sentence extraction
- sentence ordering
- smoothing
- smt system
- statistical machine translation
- statistical machine translation system
- statistical mt
- sub-sentence extraction
- translation system
- translation training
- word alignment
- word translation
Other assigned terms:
- alignment models
- alignment procedure
- alignment task
- alignment template
- approach
- association for computational linguistics
- bilingual corpora
- bleu
- bleu score
- case
- chunks
- comparable corpora
- comparable corpus
- conditional probability
- corpora
- development set
- distribution
- document
- events
- fact
- genre
- heuristic
- heuristics
- hypothesis
- implementation
- inflected form
- knowledge
- language models
- language pairs
- lexicon
- lexicon entry
- likelihood
- likelihood-ratio
- linguistics
- meaning
- measure
- method
- methodology
- noise
- non-parallel corpora
- pairs of words
- parallel corpora
- parallel corpus
- parallel sentence
- parallel texts
- parallel training corpus
- parallelism
- phrase
- precision
- probabilistic lexicon
- probabilities
- probability
- probability distribution
- probability distributions
- procedure
- punctuation
- query
- reference translation
- reordering
- segments
- sentence
- sentence level
- sentence pair
- sentences
- signal
- source language
- source sentence
- statistic
- target language
- target languages
- target sentence
- target word
- test data
- text
- tokens
- training
- training corpus
- training data
- translation candidate
- translation lexicon
- translation probabilities
- translational equivalence
- translations
- word
- word alignment task
- word association
- word pair
- words