
Fix initialization of deconvolution layer parameters in python net_spec interface #3954

Open · wants to merge 1 commit into base: master

Conversation

@SvenTwo (Author) commented Apr 6, 2016

Currently, you cannot initialize deconvolution layer parameters in the same way you can for a convolution layer, because the associated _param structure is called convolution_param instead of deconvolution_param.

E.g. the following throws an unknown attribute error:
deconv_layer = L.Deconvolution(bottom_layer, kernel_size=4, stride=2, num_output=25)

This PR solves the problem by adding a special case to the dictionary that maps layer types to parameter names in param_name_dict().

Commit: …spec interface.

The deconvolution layer uses the convolution_param subgroup for its parameters, so hardcode that connection into param_name_dict().
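
For context, a rough sketch of where the proposed change sits. The surrounding function paraphrases param_name_dict() from python/caffe/net_spec.py (structure approximate, not copied verbatim); only the two lines above the return correspond to the change in this PR.

    # Approximate sketch of param_name_dict() with the proposed special case added.
    from caffe.proto import caffe_pb2

    def param_name_dict():
        """Map layer type names (e.g. 'Convolution') to their *_param prefix."""
        layer = caffe_pb2.LayerParameter()
        # collect every *_param field of LayerParameter and its message type
        param_names = [f.name for f in layer.DESCRIPTOR.fields
                       if f.name.endswith('_param')]
        param_type_names = [type(getattr(layer, s)).__name__ for s in param_names]
        # strip '_param' / 'Parameter' so ConvolutionParameter maps to 'convolution'
        param_names = [s[:-len('_param')] for s in param_names]
        param_type_names = [s[:-len('Parameter')] for s in param_type_names]
        result = dict(zip(param_type_names, param_names))
        # convolution param is re-used in deconvolution layer
        result['Deconvolution'] = 'convolution'
        return result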
@SvenTwo (Author) commented Apr 7, 2016

Btw, simply doing convolution_param=dict(...) works of course. But I find it a bit annoying that the conv and deconv layers cannot be used symmetrically.
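
For reference, the workaround and the behavior this PR enables side by side (values taken from the example in the description):

    # Works today: pass the settings explicitly through convolution_param.
    deconv = L.Deconvolution(bottom_layer,
                             convolution_param=dict(kernel_size=4, stride=2,
                                                    num_output=25))

    # What this PR enables: the same keyword style that L.Convolution accepts.
    deconv = L.Deconvolution(bottom_layer, kernel_size=4, stride=2, num_output=25)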

@seanbell commented Jun 2, 2016

@shelhamer Since this is causing a reasonable amount of confusion, perhaps it could be merged in?

@longjon (Contributor) commented Jul 11, 2016

It's possible to special-case things like this, but I'm not sure it's a good idea: it's contrary to the way it's written in the prototxt, this is not the only case where this happens, the user has to figure out where these special cases exist, the code has to be kept in sync between different places, and it will become moot (or have to be revisited) if conv/pool params are unified.

That said this is a common and sensible case, so I'm happy to hear other opinions.

@nitnelave (Contributor) commented:

Where is it written that the deconvolution uses the convolution param? In the prototxt? If so, it would be nicer to inspect that instead of just manipulating the names to create the param names.

@SvenTwo (Author) commented Jul 11, 2016

@nitnelave You can look it up in the automatically generated documentation from the source code here: http://caffe.berkeleyvision.org/doxygen/classcaffe_1_1DeconvolutionLayer.html#details

I usually read things directly in the source code because layer-specific caffe documentation still has some room for improvement ;-)

@nitnelave (Contributor) commented:

@SvenTwo Yeah, I mean, where in the code is it written that Deconvolution uses the convolution param? Is it just in the prototxt? I had a quick look but didn't find anything relevant.

@SvenTwo (Author) commented Jul 11, 2016

@nitnelave It's implicit because the deconv layer class inherits from the convolution layer class, which handles the setup from parameters.
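
One way to confirm from the Python side that the proto itself declares no deconvolution parameter (a small check, not part of this PR; it only assumes the generated caffe_pb2 module is importable):

    from caffe.proto import caffe_pb2

    fields = [f.name for f in caffe_pb2.LayerParameter.DESCRIPTOR.fields]
    print('convolution_param' in fields)    # True
    print('deconvolution_param' in fields)  # False: no such field, so the re-use
                                            # lives in the C++ class hierarchy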

@nitnelave (Contributor) commented:

Would it be possible to add some code to REGISTER_LAYER_CLASS to have a list of layers accessible somewhere (ultimately to python), with their corresponding parameters where relevant?
The default would be LayerParameter, of course, but we could add a specific parameter name for cases such as this.
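
A purely hypothetical illustration of the registry-driven mapping being suggested; none of these names exist in Caffe today, this is only to make the idea concrete:

    # Hypothetical: if the layer registry exported (layer_type, param_field) pairs,
    # net_spec could build its mapping from that instead of hard-coding special cases.
    def param_name_dict_from_registry(registered_layers):
        """registered_layers: iterable of (layer_type, param_field or None) pairs."""
        result = {}
        for layer_type, param_field in registered_layers:
            if param_field is not None:
                # e.g. ('Deconvolution', 'convolution_param') -> 'convolution'
                result[layer_type] = param_field[:-len('_param')]
        return result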

@SvenTwo (Author) commented Jul 11, 2016

Isn't that a bit overengineered for this one case? I just proposed a one-line code change that shortens the definition of deconv layers a bit.

@nitnelave (Contributor) commented:

I know and I agree with you, that's a good fix for now. But in the future, you would have to list all the layers that share parameters, which complicates the definition of a new layer.

So, to be more error resilient, I think we need some registration at the layer level. Plus, a list of existing layers would be useful in Python.


@shelhamer mentioned this pull request Apr 14, 2017
@melody-rain commented:

Is there a better way to implement this than the following?

    # convolution param is re-used in deconvolution layer
    result['Deconvolution'] = 'convolution'
