W05-1509 likelihood objective function and weight decay regularization ( Bishop , 1995 ) . tion moves
J13-4006 Backpropagation , such as momentum and weight decay regularization , are also used . Momentum makes
J13-4006 thereby speeding convergence . Weight decay regularization is equivalent to a Gaussian prior
W04-0305 for the history representation . Weight decay regularization was applied at the beginning