X93-1009 |
itself very nicely to conventional
|
neural network training
|
algorithms . One of those algorithms
|
P14-2023 |
n-best lists with the new weights .
|
Neural Network Training
|
. All neural network models are
|
P14-1129 |
of -- 15x . Table 1 shows the
|
neural network training
|
results with various values of
|
D14-1003 |
Huck et al. , 2013 ) . For the
|
neural network training
|
, we selected a subset of 9M
|
W15-3034 |
using 17 dense features . The
|
neural network training
|
is performed using the same data
|
D14-1003 |
using 17 dense features . The
|
neural network training
|
was performed using a selection
|
D15-1040 |
representation learning as part of a
|
neural network training
|
. The underlying hypothesis for
|
W04-0305 |
d1 , ... , di − 1 ) . The
|
neural network training
|
methods we use try to find representations
|
P04-1013 |
, dm ) ) , respectively . The
|
neural network training
|
methods we use try to find representations
|
D15-1029 |
Segmenter with default settings . 3.2
|
Neural Network Training
|
Training was performed with an
|
E03-1002 |
which are induced as part of the
|
neural network training
|
process . These induced features
|
P14-1013 |
. There are two phases in our
|
neural network training
|
process : pre-training and fine-tuning
|
P14-1013 |
machine translation task . In
|
neural network training
|
, a large number of monolingual
|
P13-1017 |
neural networks . Besides that ,
|
neural network training
|
also involves some hyperparameters
|
P14-1129 |
are given in Section 6.5 . 2.2
|
Neural Network Training
|
The training procedure is identical
|
P05-1023 |
finite number of these parameters .
|
Neural network training
|
is applied to determine the values
|
P05-1023 |
( d1 , ... , di − 1 ) .
|
Neural network training
|
tries to find such a history
|
P04-1013 |
This paper has also proposed a
|
neural network training
|
method which optimizes a discriminative
|
W13-4707 |
propagation method is commonly used in
|
neural network training
|
( V. J. Hodge and J. Austin 2003
|
D15-1214 |
inspired by " dropout " as is used in
|
neural network training
|
, where various connections between
|