DL4J: Add Padam adaptive gradient updater #6253

Open
AlexDBlack opened this issue Aug 23, 2018 · 9 comments

Comments

Contributor

@AlexDBlack AlexDBlack commented Aug 23, 2018

The Padam updater was recently described here: https://arxiv.org/pdf/1806.06763.pdf

It is an extension of Adam/AMSGrad that claims to generalize (test accuracy) as well as SGD while keeping the fast convergence of Adam/AMSGrad. Mathematically, it's basically a blend of SGD and AMSGrad.
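
For anyone picking this up, here is a minimal, framework-free sketch of one Padam step as I read the paper: AMSGrad's moment updates, but with the running maximum of the second moment raised to a partial power p in (0, 1/2] instead of the square root. With p = 1/2 this is the AMSGrad step; as p shrinks toward 0 the denominator approaches 1 and the step approaches SGD with momentum. The class name `PadamSketch`, the field names, and the placement of `epsilon` are assumptions made for illustration, and the sketch omits bias correction. It is not ND4J code.

```java
import java.util.Arrays;

/** Framework-free sketch of the Padam update on flat double arrays (not ND4J code). */
final class PadamSketch {
    final double learningRate;
    final double beta1;    // decay rate for the first moment
    final double beta2;    // decay rate for the second moment
    final double partial;  // p in (0, 0.5]; 0.5 gives the AMSGrad step
    final double epsilon;  // assumption: small constant added to the denominator for stability
    final double[] m;      // first-moment estimate
    final double[] v;      // second-moment estimate
    final double[] vHat;   // element-wise running maximum of v (the AMSGrad part)

    PadamSketch(int size, double learningRate, double beta1, double beta2,
                double partial, double epsilon) {
        this.learningRate = learningRate;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.partial = partial;
        this.epsilon = epsilon;
        this.m = new double[size];
        this.v = new double[size];
        this.vHat = new double[size];
    }

    /** Applies one update to params in place, given the gradient for this step. */
    void step(double[] params, double[] gradient) {
        for (int i = 0; i < params.length; i++) {
            double g = gradient[i];
            m[i] = beta1 * m[i] + (1.0 - beta1) * g;
            v[i] = beta2 * v[i] + (1.0 - beta2) * g * g;
            vHat[i] = Math.max(vHat[i], v[i]);
            // Only this line differs from AMSGrad: vHat^p instead of sqrt(vHat).
            params[i] -= learningRate * m[i] / (Math.pow(vHat[i], partial) + epsilon);
        }
    }

    public static void main(String[] args) {
        PadamSketch padam = new PadamSketch(2, 0.1, 0.9, 0.999, 0.125, 1e-8);
        double[] weights = {1.0, -1.0};
        padam.step(weights, new double[] {2.0, -0.5});
        System.out.println(Arrays.toString(weights));
    }
}
```

A real updater would presumably do the same arithmetic on the framework's array views rather than on Java arrays, and would need to decide how to handle bias correction.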

Implementing this isn't a high priority for the core team. If anyone wants to tackle this, there are configuration and implementation classes here (we'll need one of each for Padam):
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning/config
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning

@stolsvik stolsvik commented Jan 26, 2019

For some extra reference, see https://github.com/deeplearning4j/deeplearning4j/issues/5843#issuecomment-414965829 and the few comments that follow it.

@achalagarwal achalagarwal commented Mar 23, 2019

I'd like to give this a shot! As it's my first time contributing to DL4J, do you have any advice/suggestions for me?

@achalagarwal achalagarwal commented Mar 24, 2019

@AlexDBlack

Are we sure about creating new classes for Padam?
As it's just a slight modification of AMSGrad, wouldn't it be better to support Padam via AMSGrad by simply adding additional fields wherever required?

Member

@saudet saudet commented Mar 24, 2019

Contributor Author

@AlexDBlack AlexDBlack commented Mar 25, 2019

Yeah, I'm ok with a separate class (extending AMSGrad if that makes sense).
Though we might end up with a little redundancy, I think I'd prefer a dedicated class for it for usability reasons - i.e., it'll be easier to find as a dedicated class rather than as an option in AMSGrad.

@achalagarwal achalagarwal commented Mar 29, 2019

@AlexDBlack

Hi, I added the required classes. It is safe to merge:
Unified Commit

@saudet

I did not add the predicate for the parameter's range; instead, I log a warning. I'll need your help with predicates.

I haven't opened a pull request yet, as I haven't tested the code. I couldn't build the project using IntelliJ on macOS (tried a lot of things). Can someone point me to a thorough README/guide for building it?

Member

@saudet saudet commented Mar 29, 2019

@achalagarwal Its name is "Preconditions", actually. Just do something like this:

Preconditions.checkArgument(bias != null, "LayerNorm: Use constructor without bias argument if bias is null / not available.");

https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/impl/transforms/custom/LayerNorm.java#L47
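
For the Padam case, the range check could follow the same call shape. A minimal sketch, assuming the partial parameter p must lie in (0, 0.5] as in the paper; `PadamConfigSketch` and its field are hypothetical, and the local `checkArgument` is only a stand-in so the snippet compiles without ND4J on the classpath (in the real class you would call the `Preconditions.checkArgument` shown above instead):

```java
/** Hypothetical config holder showing a range check on Padam's partial parameter. */
final class PadamConfigSketch {
    private final double partial;

    PadamConfigSketch(double partial) {
        // Rejects NaN as well, since every comparison with NaN is false.
        checkArgument(partial > 0.0 && partial <= 0.5,
                "Padam: partial parameter p must be in (0, 0.5], got " + partial);
        this.partial = partial;
    }

    double getPartial() {
        return partial;
    }

    // Stand-in for a Preconditions-style helper; not the ND4J implementation.
    static void checkArgument(boolean condition, String message) {
        if (!condition) {
            throw new IllegalArgumentException(message);
        }
    }

    public static void main(String[] args) {
        System.out.println(new PadamConfigSketch(0.125).getPartial());
        try {
            new PadamConfigSketch(0.75);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Failing fast in the constructor, rather than logging a warning, means a bad value is caught when the configuration is built instead of showing up later as a training problem.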

Use Maven on the command line with mvn clean install -Dmaven.test.skip before trying it in an IDE.

@achalagarwal achalagarwal commented Apr 2, 2019

The build was successful, but I had to skip a couple of projects due to network issues (HTTP requests failed).

On Ubuntu:
mvn clean install -Dmaven.test.skip -pl '!:deeplearning4j-dataimport-solrj, !:deeplearning4j-modelexport-solr'

@AlexDBlack

Now, how do you suggest I validate the correctness of Padam? Do you want me to build a model and replicate results from a publication? That would take a lot of time. Are there existing tests for the linalg/learning modules? I could not find any.

Commit

cc: @saudet

Contributor Author

@AlexDBlack AlexDBlack commented Apr 2, 2019

@achalagarwal we have updater tests here; adding to them would be good:
https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/updater/TestUpdaters.java

We'll carefully review the implementation too once you've opened a pull request. That should be good enough I think.
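
One cheap check that could go alongside the existing updater tests is the limiting case: with p = 0.5, Padam should reproduce AMSGrad step for step. The sketch below is plain Java rather than a TestUpdaters-style JUnit test; both step functions are hypothetical scalar helpers that skip bias correction and epsilon, and it assumes a nonzero first gradient so the denominator is never zero.

```java
/** Limiting-case check: Padam with p = 0.5 should track AMSGrad (scalar sketch, not ND4J code). */
final class PadamLimitCheck {
    static final double LR = 0.01;
    static final double BETA1 = 0.9;
    static final double BETA2 = 0.999;

    // state = {m, v, vHat}; returns the updated parameter.
    static double padamStep(double w, double g, double[] state, double p) {
        state[0] = BETA1 * state[0] + (1.0 - BETA1) * g;
        state[1] = BETA2 * state[1] + (1.0 - BETA2) * g * g;
        state[2] = Math.max(state[2], state[1]);
        return w - LR * state[0] / Math.pow(state[2], p);
    }

    // Reference AMSGrad step written separately, with an explicit square root.
    static double amsgradStep(double w, double g, double[] state) {
        state[0] = BETA1 * state[0] + (1.0 - BETA1) * g;
        state[1] = BETA2 * state[1] + (1.0 - BETA2) * g * g;
        state[2] = Math.max(state[2], state[1]);
        return w - LR * state[0] / Math.sqrt(state[2]);
    }

    /** Largest parameter gap between the two updaters over a gradient sequence. */
    static double maxGap(double[] gradients) {
        double wPadam = 1.0;
        double wAms = 1.0;
        double gap = 0.0;
        double[] sPadam = new double[3];
        double[] sAms = new double[3];
        for (double g : gradients) {
            wPadam = padamStep(wPadam, g, sPadam, 0.5);
            wAms = amsgradStep(wAms, g, sAms);
            gap = Math.max(gap, Math.abs(wPadam - wAms));
        }
        return gap;
    }

    public static void main(String[] args) {
        double gap = maxGap(new double[] {0.5, -1.0, 0.25, 2.0});
        System.out.println(gap < 1e-12 ? "p = 0.5 matches AMSGrad" : "mismatch: " + gap);
    }
}
```

The same pattern extends to a hand-computed single step for some p below 0.5, which catches the opposite mistake of the partial parameter being ignored.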
