DL4J: Add Padam adaptive gradient updater #6253
Comments
For extra reference, see https://github.com/deeplearning4j/deeplearning4j/issues/5843#issuecomment-414965829 and the few comments that follow it. |
I'd like to give this a shot! As it's my first time contributing to DL4J, do you have any advice/suggestions for me? |
Are we sure about creating new classes for Padam? |
We can extend AMSGrad, of course. That's what OOP is for :)
|
Yeah, I'm ok with a separate class (extending AMSGrad if that makes sense). |
Hi, I've added the required classes. It is safe to merge: I did not add the predicate for the parameter's range and log a warning instead. I will need your help with predicates. I haven't opened a pull request yet because I haven't tested the code; I couldn't build the project with IntelliJ on macOS, despite trying a lot of things. Can someone point me to a thorough README/guide for building it? |
@achalagarwal Its name is "Preconditions", actually. Just do something like this:
Use Maven on the command line with |
The build was successful, but I had to skip a couple of projects due to network issues (HTTP requests failed). On Ubuntu: Now, how do you suggest I validate the correctness of Padam? Do you want me to build a model and replicate results from a publication? That would take a lot of time. Are there relevant tests for the linalg/learning modules? I could not find any. cc: @saudet |
@achalagarwal We have updater tests here; adding to those would be good: We'll carefully review the implementation too once you've opened a pull request. That should be good enough, I think. |
The Padam updater was recently described here: https://arxiv.org/pdf/1806.06763.pdf
It is an extension of Adam/AMSGrad that claims to reach the accuracy (generalization) of SGD while keeping the fast convergence of Adam/AMSGrad. Mathematically, it is essentially a blend of SGD and AMSGrad.
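To make the "blend" concrete, here is a minimal plain-Java sketch of one Padam step as I understand it from the paper: the AMSGrad moment estimates are kept, but the second-moment term is raised to a partial power p in (0, 1/2] instead of 1/2. With p = 1/2 this reduces to AMSGrad, and as p approaches 0 the step approaches SGD with momentum. The class name `PadamSketch`, its fields, and the `eps` stabilizer are assumptions for illustration only; this is not the ND4J updater API, and it works on `double[]` rather than `INDArray`.

```java
import java.util.Arrays;

// Hypothetical illustration of the Padam update rule; not ND4J code.
final class PadamSketch {
    private final double lr, beta1, beta2, p, eps;
    private final double[] m, v, vHat; // first moment, second moment, running max of v

    public PadamSketch(int n, double lr, double beta1, double beta2, double p, double eps) {
        // Range check on the partial adaptive parameter p, as discussed in the thread.
        if (!(p > 0.0 && p <= 0.5)) {
            throw new IllegalArgumentException("p must be in (0, 0.5], got " + p);
        }
        this.lr = lr;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.p = p;
        this.eps = eps; // assumption: small constant added for numerical stability
        this.m = new double[n];
        this.v = new double[n];
        this.vHat = new double[n];
    }

    /** Applies one Padam step to params in place, given the gradient. */
    public void step(double[] params, double[] grad) {
        for (int i = 0; i < params.length; i++) {
            m[i] = beta1 * m[i] + (1 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i];
            vHat[i] = Math.max(vHat[i], v[i]);                      // AMSGrad max step
            params[i] -= lr * m[i] / (Math.pow(vHat[i], p) + eps);  // partial power p
        }
    }

    public static void main(String[] args) {
        double[] theta = {1.0, -2.0};
        PadamSketch opt = new PadamSketch(2, 0.1, 0.9, 0.999, 0.125, 1e-8);
        for (int t = 0; t < 200; t++) {
            double[] grad = {2 * theta[0], 2 * theta[1]}; // gradient of sum(x^2)
            opt.step(theta, grad);
        }
        System.out.println(Arrays.toString(theta));
    }
}
```

A sketch like this also suggests a cheap sanity check for a real implementation: with p = 0.5 its output should match the existing AMSGrad updater for the same inputs, up to how each handles epsilon.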
Implementing this isn't a high priority for the core team. If anyone wants to tackle this, there are configuration and implementation classes here (we'll need one of each for Padam):
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning/config
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning