DL4J: Add Padam adaptive gradient updater #6253

AlexDBlack · 2018-08-23T00:33:57Z

The Padam updater was recently described here: https://arxiv.org/pdf/1806.06763.pdf

It is an extension of Adam/AMSGrad that claims to match the generalization performance (accuracy) of SGD while keeping the fast convergence of Adam/AMSGrad. Mathematically, it's basically a blend of SGD with momentum and AMSGrad, controlled by one extra hyperparameter (the partial adaptive parameter p).
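To make the "blend" concrete, here is a minimal sketch of one Padam step as I read the paper: the AMSGrad moment estimates are kept, but the second moment is raised to a power p in (0, 0.5] instead of taking its square root. The class name `PadamSketch`, the `step` method, the plain `double[]` arrays, and the placement of `eps` in the denominator are all assumptions made for illustration; this is not DL4J/ND4J code and it omits bias correction.

```java
/** Illustrative Padam step on plain arrays; hypothetical class, not the DL4J updater. */
final class PadamSketch {
    private final double lr, beta1, beta2, p, eps;
    private final double[] m;     // first-moment (momentum) estimate
    private final double[] v;     // second-moment estimate
    private final double[] vHat;  // running element-wise max of v (the AMSGrad part)

    PadamSketch(int n, double lr, double beta1, double beta2, double p, double eps) {
        this.lr = lr;
        this.beta1 = beta1;
        this.beta2 = beta2;
        this.p = p;
        this.eps = eps;
        this.m = new double[n];
        this.v = new double[n];
        this.vHat = new double[n];
    }

    /** Applies one update to params in place, given the gradient for this step. */
    void step(double[] params, double[] grad) {
        for (int i = 0; i < params.length; i++) {
            m[i] = beta1 * m[i] + (1 - beta1) * grad[i];
            v[i] = beta2 * v[i] + (1 - beta2) * grad[i] * grad[i];
            vHat[i] = Math.max(vHat[i], v[i]);
            // p = 0.5 gives the AMSGrad denominator; as p approaches 0 the
            // denominator approaches 1 and the step approaches SGD with momentum.
            // Adding eps here is an assumption, not something the paper specifies.
            params[i] -= lr * m[i] / (Math.pow(vHat[i], p) + eps);
        }
    }
}
```

Calling `new PadamSketch(n, 0.1, 0.9, 0.999, 0.125, 1e-8).step(params, grad)` would perform one in-place update; the hyperparameter values in that call are placeholders, not recommendations.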

Implementing this isn't a high priority for the core team. If anyone wants to tackle this, there are configuration and implementation classes here (we'll need one of each for Padam):
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning/config
https://github.com/deeplearning4j/deeplearning4j/tree/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/learning

stolsvik · 2019-01-26T10:00:09Z

For some extra reference: https://github.com/deeplearning4j/deeplearning4j/issues/5843#issuecomment-414965829 and the few comments that follow it.

achalagarwal · 2019-03-23T12:05:43Z

I'd like to give this a shot! As it's my first time contributing to DL4J, do you have any advice/suggestions for me?

achalagarwal · 2019-03-24T03:07:08Z

@AlexDBlack

Are we sure about creating new classes for Padam?
As it's just a slight modification of AMSGrad, wouldn't it be better to support Padam via AMSGrad by simply adding the extra fields where required?

saudet · 2019-03-24T07:41:03Z

We can extend AmsGrad, of course. That's what OOP is for :)

AlexDBlack · 2019-03-25T00:11:43Z

Yeah, I'm ok with a separate class (extending AMSGrad if that makes sense).
Though we might end up with a little redundancy, I think I'd prefer a dedicated class for it for usability reasons - i.e., it'll be easier to find as a dedicated class rather than as an option in AMSGrad.

achalagarwal · 2019-03-29T05:53:59Z

@AlexDBlack

Hi, I added the required classes. It is safe to merge:
Unified Commit

@saudet

I did not add the predicate for the parameter's valid range; I log a warning instead. I will need your help with predicates.

I haven't opened a pull request because I haven't tested the code yet. I couldn't build the project using IntelliJ on macOS (I tried a lot of things). Can someone point me to a thorough README/guide for building it?

saudet · 2019-03-29T06:34:32Z

@achalagarwal Its name is "Preconditions", actually. Just do something like this:

Preconditions.checkArgument(bias != null, "LayerNorm: Use constructor without bias argument if bias is null / not available.");

https://github.com/deeplearning4j/deeplearning4j/blob/master/nd4j/nd4j-backends/nd4j-api-parent/nd4j-api/src/main/java/org/nd4j/linalg/api/ops/impl/transforms/custom/LayerNorm.java#L47
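Applied to Padam, the range check could look roughly like the sketch below. It assumes the valid range for the partial parameter is (0, 0.5], and it uses a plain `IllegalArgumentException` so that it compiles without ND4J on the classpath; in the real config class the same condition would go into a `Preconditions.checkArgument(...)` call like the one quoted above. The class name `PartialExponent` is hypothetical.

```java
/** Hypothetical holder for Padam's partial parameter; not the DL4J config class. */
final class PartialExponent {
    final double value;

    PartialExponent(double p) {
        // Fail fast instead of logging a warning. The negated form also
        // rejects NaN, which would slip through a check like (p <= 0 || p > 0.5).
        if (!(p > 0.0 && p <= 0.5)) {
            throw new IllegalArgumentException(
                    "Padam: partial parameter p must be in (0, 0.5], got " + p);
        }
        this.value = p;
    }
}
```

Rejecting bad values at construction time means a misconfigured updater fails before training starts rather than producing a warning that is easy to miss in the logs.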

Use Maven on the command line with mvn clean install -Dmaven.test.skip before trying it in an IDE.

achalagarwal · 2019-04-02T05:27:24Z

The build was successful, but I had to skip a couple of projects due to network issues (HTTP requests failed).

On Ubuntu:
mvn clean install -Dmaven.test.skip -pl '!:deeplearning4j-dataimport-solrj,!:deeplearning4j-modelexport-solr'

@AlexDBlack

Now, how do you suggest I validate the correctness of Padam? Do you want me to build a model and replicate results from a publication? This will take a lot of time. Are there relevant tests for the linalg/learning modules? I could not find any.

Commit

cc: @saudet

AlexDBlack · 2019-04-02T09:04:17Z

@achalagarwal we have updater tests here; adding to those would be good:
https://github.com/deeplearning4j/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-core/src/test/java/org/deeplearning4j/nn/updater/TestUpdaters.java

We'll carefully review the implementation too once you've opened a pull request. That should be good enough I think.
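One cheap sanity check that does not require replicating published results: with p = 0.5, Padam should reproduce AMSGrad step for step. The sketch below shows the idea on a single scalar parameter in plain Java. It is not written against TestUpdaters or any DL4J class; the class name, the hyperparameter values, and the eps placement are assumptions for illustration, and bias correction is left out.

```java
/** Hypothetical sanity check: Padam with p = 0.5 should match AMSGrad. */
final class PadamVsAmsGradCheck {

    /**
     * Runs one scalar update per gradient and returns the final parameter.
     * usePow = true uses vHat^p (Padam); false uses sqrt(vHat) (AMSGrad reference).
     */
    static double run(double[] grads, double p, boolean usePow) {
        double lr = 0.01, beta1 = 0.9, beta2 = 0.999, eps = 1e-8;
        double theta = 1.0, m = 0.0, v = 0.0, vHat = 0.0;
        for (double g : grads) {
            m = beta1 * m + (1 - beta1) * g;
            v = beta2 * v + (1 - beta2) * g * g;
            vHat = Math.max(vHat, v);
            double denom = usePow ? Math.pow(vHat, p) : Math.sqrt(vHat);
            theta -= lr * m / (denom + eps);
        }
        return theta;
    }

    /** True when the p = 0.5 trajectory agrees with the AMSGrad reference within tol. */
    static boolean matchesAmsGrad(double[] grads, double tol) {
        return Math.abs(run(grads, 0.5, true) - run(grads, 0.5, false)) <= tol;
    }
}
```

A test in the real suite could follow the same pattern: feed both updaters the same gradient sequence and compare the resulting updates, then add a second case with p below 0.5 checked against hand-computed values.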

AlexDBlack added Enhancement help wanted labels Aug 23, 2018

RobAltena added the good first issue label Aug 23, 2018
