First-order derivatives: n additional function calls are needed. Second-order derivatives based on gradient calls, when the "grd" module is specified (Dennis and Schnabel 1983): n additional gradient ...
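The call counts above can be illustrated with a minimal forward-difference sketch. It is not the code of the package the snippet describes: the names `fd_gradient` and `fd_hessian_from_gradient`, the step sizes, and the final symmetrization are all assumptions made for illustration. It shows only that a forward-difference gradient costs n extra function calls and that a Hessian built from gradient differences costs n extra gradient calls.

```python
def fd_gradient(f, x, f0=None, h=1e-7):
    """Forward-difference gradient (hypothetical helper).

    Uses n calls of f beyond the base value f(x), one per coordinate.
    """
    if f0 is None:
        f0 = f(x)
    grad = []
    for i in range(len(x)):
        step = h * max(abs(x[i]), 1.0)  # assumed step-size rule
        xp = list(x)
        xp[i] += step
        grad.append((f(xp) - f0) / step)
    return grad


def fd_hessian_from_gradient(grad, x, g0=None, h=1e-5):
    """Forward-difference Hessian from gradient calls (hypothetical helper).

    Uses n calls of grad beyond the base value grad(x), one per coordinate.
    """
    n = len(x)
    if g0 is None:
        g0 = grad(x)
    cols = []
    for j in range(n):
        step = h * max(abs(x[j]), 1.0)  # assumed step-size rule
        xp = list(x)
        xp[j] += step
        gj = grad(xp)
        cols.append([(gj[i] - g0[i]) / step for i in range(n)])
    # cols[j][i] approximates H[i][j]; average with the transpose
    # so the returned matrix is symmetric.
    return [[0.5 * (cols[j][i] + cols[i][j]) for j in range(n)]
            for i in range(n)]


def quadratic(x):
    # Test function with gradient [2*x0 + 3*x1, 3*x0 + 4*x1]
    # and constant Hessian [[2, 3], [3, 4]].
    return x[0] ** 2 + 3.0 * x[0] * x[1] + 2.0 * x[1] ** 2


def quadratic_grad(x):
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0] + 4.0 * x[1]]
```

Calling `fd_gradient(quadratic, [1.0, 2.0])` approximates the gradient `[8, 11]`, and `fd_hessian_from_gradient(quadratic_grad, [1.0, 2.0])` approximates `[[2, 3], [3, 4]]`. Passing a precomputed `f0` or `g0` avoids the base evaluation, so only the n additional calls remain.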
HesScale is built on top of PyTorch and BackPACK. It backpropagates Hessian diagonals through the layers of the network. We appreciate any help to extend HesScale to recurrent neural ...