D15-1040 representation automatically during joint training . The performance results for
D13-1054 parameters with respect to the joint training objective . Given a set of parameters
D10-1037 using a novel EM-based method for joint training . We evaluate our approach on
D15-1040 our model can easily be used for joint training over k > 2 languages . We
D15-1064 embeddings , combined with our joint training objective , provide a large improvement
D14-1017 results using geometric means . The joint training method ( Liang et al. , 2006
D15-1040 This shows the superiority of joint training compared with single language
D13-1074 described in Sec . 4 . We show that joint training produces an even stronger gain
D15-1040 the tagset mapping as part of joint training . Beyond 15k tokens , the joint
D15-1064 performed the best on dev data for joint training . General Results Table 2 shows
D10-1102 second property is desirable since joint training avoids error propagation that
D15-1040 Regularization Parameter Tuning Joint training with a dictionary ( see equation
D10-1019 data set . • c = number of joint training iterations . • cs = number
D14-1015 local and global context via a joint training objective . Much of the research
D15-1053 propose to introduce a degree of joint training of parameters is to incorporate
D10-1019 using virtual nodes , and performs joint training and decoding in the factorized
D15-1121 efforts along these lines is the joint training of the CTM and the log-linear
D15-1064 text . Finally , we propose a joint training objective for the embeddings
D15-1040 future work , we plan to extend joint training to several languages , and further
D10-1019 graph structure that exploits joint training and decoding in the factorized