Feature request
Dear huggingface community,
I am experimenting with the ViTMAE model from the transformers library. The ViTMAEConfig class has the option "num_channels" to specify the number of input (color) channels of an image. If I change this to, say, 1 (to process grayscale images), the model throws an error because the number "3" is hard-coded into the functions "patchify" and "unpatchify" in the file "modeling_vit_mae.py".
Motivation
I would like to request that this be changed so that any number of input channels is possible.
Your contribution
As noted above, one only has to change the functions "patchify" and "unpatchify" so that they either read the number of color channels from the input tensor or receive it as a class attribute that both functions can use (instead of the hard-coded value "3"). I tried this on my system and it worked fine.
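To illustrate the change described above, here is a minimal sketch, not the actual code from "modeling_vit_mae.py". It assumes square images, and it uses standalone functions with hypothetical "patch_size" and "num_channels" parameters in place of the model's methods and config values. "patchify" reads the channel count from the input shape, while "unpatchify" receives it explicitly, as it would from "config.num_channels".

```python
import math

import torch


def patchify(pixel_values: torch.Tensor, patch_size: int) -> torch.Tensor:
    """(N, C, H, W) -> (N, num_patches, patch_size**2 * C); C is read from the input."""
    n, c, height, width = pixel_values.shape
    if height != width or height % patch_size != 0:
        raise ValueError("expected square images whose side is divisible by patch_size")
    h = w = height // patch_size
    # Split each spatial axis into (patches per side, pixels per patch).
    x = pixel_values.reshape(n, c, h, patch_size, w, patch_size)
    # Move the channel axis last so each patch is flattened as (p, q, c).
    x = torch.einsum("nchpwq->nhwpqc", x)
    return x.reshape(n, h * w, patch_size**2 * c)


def unpatchify(patches: torch.Tensor, patch_size: int, num_channels: int) -> torch.Tensor:
    """(N, num_patches, patch_size**2 * C) -> (N, C, H, W), the inverse of patchify."""
    n, num_patches, _ = patches.shape
    h = w = math.isqrt(num_patches)
    if h * w != num_patches:
        raise ValueError("number of patches must be a perfect square")
    # num_channels replaces the hard-coded 3.
    x = patches.reshape(n, h, w, patch_size, patch_size, num_channels)
    x = torch.einsum("nhwpqc->nchpwq", x)
    return x.reshape(n, num_channels, h * patch_size, w * patch_size)


# Grayscale round trip: one channel, 32x32 images, 16x16 patches.
images = torch.randn(2, 1, 32, 32)
patches = patchify(images, patch_size=16)
restored = unpatchify(patches, patch_size=16, num_channels=1)
```

With these inputs, "patches" has shape (2, 4, 256) and "restored" matches "images" exactly, since both functions only rearrange elements.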
Hey,
thanks for the fast reply. I am a complete beginner with GitHub (so far I have only used it to manage my personal projects) and I will not be getting into the PR procedure anytime soon. I did not know how to reach the people maintaining the ViTMAE project, so I opened this issue just to point the problem out. If no one else has the same problem, or no one has time to look into it, then that is just how it is, I guess.