
Creates standard for PreTrained behavior #2360

Merged · 1 commit merged into deepjavalibrary:master on Feb 3, 2023

Conversation

zachgk (Contributor) commented Feb 1, 2023

This changes the standard DJL behavior for pre-trained blocks: they now start out with frozen parameters. The change has been applied to the embeddings.

Previously this behavior applied only to PyTorch; it now applies to all engines. However, I did leave a carveout for loaded models. It adds a boolean "wasLoaded" so that if you load a model and then create a Trainer directly from it, it will not be frozen. If you load a model and then append some new layers to it (as several of our examples do), the pre-trained part will need to be unfrozen to retrain it.

fixes #2351
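
For illustration, here is a minimal sketch of the two cases described above: training a loaded model directly (covered by the "wasLoaded" carveout) versus appending new layers to a pre-trained block, where the pre-trained part now starts out frozen. This is not code from the PR; the Criteria setup is hypothetical, and `Block#freezeParameters(boolean)` is assumed from DJL's transfer-learning usage.

```java
import ai.djl.Model;
import ai.djl.nn.Block;
import ai.djl.nn.SequentialBlock;
import ai.djl.nn.core.Linear;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.training.DefaultTrainingConfig;
import ai.djl.training.Trainer;
import ai.djl.training.loss.Loss;

public class PreTrainedFreezeSketch {

    public static void main(String[] args) throws Exception {
        // Hypothetical criteria; a real model-zoo filter or URL would go here.
        Criteria<Object, Object> criteria =
                Criteria.builder().setTypes(Object.class, Object.class).build();

        try (ZooModel<Object, Object> loaded = criteria.loadModel()) {
            // Case 1: create a Trainer directly from the loaded model.
            // Under the "wasLoaded" carveout, its parameters are not left frozen here.
            try (Trainer trainer = loaded.newTrainer(
                    new DefaultTrainingConfig(Loss.softmaxCrossEntropyLoss()))) {
                // fine-tune the whole model as before ...
            }

            // Case 2: append new layers to the pre-trained block.
            // The pre-trained part starts out frozen, so unfreeze it to retrain it.
            Block base = loaded.getBlock();
            SequentialBlock transfer = new SequentialBlock()
                    .add(base)
                    .add(Linear.builder().setUnits(10).build());
            base.freezeParameters(false); // assumed API for unfreezing parameters

            try (Model model = Model.newInstance("transfer")) {
                model.setBlock(transfer);
                // create a Trainer for `model` and train as usual ...
            }
        }
    }
}
```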

@codecov-commenter

Codecov Report

Base: 72.08% // Head: 74.37% // This PR increases project coverage by +2.28% 🎉

Coverage data is based on head (f118ce9) compared to base (bb5073f).
Patch coverage: 74.70% of modified lines in pull request are covered.


Additional details and impacted files
@@             Coverage Diff              @@
##             master    #2360      +/-   ##
============================================
+ Coverage     72.08%   74.37%   +2.28%     
- Complexity     5126     6817    +1691     
============================================
  Files           473      670     +197     
  Lines         21970    29599    +7629     
  Branches       2351     3073     +722     
============================================
+ Hits          15838    22013    +6175     
- Misses         4925     6086    +1161     
- Partials       1207     1500     +293     
| Impacted Files | Coverage Δ |
|---|---|
| api/src/main/java/ai/djl/modality/cv/Image.java | 69.23% <ø> (-4.11%) ⬇️ |
| ...rc/main/java/ai/djl/modality/cv/MultiBoxPrior.java | 76.00% <ø> (ø) |
| ...rc/main/java/ai/djl/modality/cv/output/Joints.java | 71.42% <ø> (ø) |
| .../main/java/ai/djl/modality/cv/output/Landmark.java | 100.00% <ø> (ø) |
| ...main/java/ai/djl/modality/cv/output/Rectangle.java | 72.41% <0.00%> (ø) |
| ...i/djl/modality/cv/translator/BigGANTranslator.java | 21.42% <0.00%> (-5.24%) ⬇️ |
| .../modality/cv/translator/ImageFeatureExtractor.java | 0.00% <0.00%> (ø) |
| .../ai/djl/modality/cv/translator/YoloTranslator.java | 27.77% <0.00%> (+18.95%) ⬆️ |
| ...ain/java/ai/djl/modality/cv/util/NDImageUtils.java | 67.10% <0.00%> (+7.89%) ⬆️ |
| api/src/main/java/ai/djl/modality/nlp/Decoder.java | 63.63% <ø> (ø) |

... and 616 more



@zachgk zachgk merged commit cba412d into deepjavalibrary:master Feb 3, 2023
@zachgk zachgk deleted the preTrained branch February 3, 2023 17:58
zachgk added a commit to zachgk/djl that referenced this pull request Feb 14, 2023
In deepjavalibrary#2360, the behavior for pre-trained models was changed to freeze their parameters. However, freezing the parameters on MXNet seems to cause a significant performance regression for training. This removes those changes as a temporary workaround until a deeper investigation can take place.
frankfliu added a commit that referenced this pull request Feb 14, 2023
* Remove performance issues from freezing MXNet


Co-authored-by: Frank Liu <frankfliu2000@gmail.com>
Successfully merging this pull request may close these issues.

ai.djl.nn.core.Embedding embedding matrix changes during optimization