
Feature Request: distinguish the parameters that require grad from those that do not, for PyTorch models #51

Closed
2catycm opened this issue Dec 6, 2024 · 3 comments · Fixed by #59
Labels
feature-request New feature or request

Comments

2catycm commented Dec 6, 2024

No description provided.

danieldjohnson (Collaborator) commented

Hi @2catycm, can you clarify what you mean by this?

2catycm (Author) commented Jan 22, 2025

> Hi @2catycm, can you clarify what you mean by this?

Hi, thanks for your reply. Sorry my description was not clear.

In PyTorch, some torch.Tensor objects require grad, which means that when you call loss.backward(), you want the gradient of the loss to be computed with respect to that tensor. We can pass requires_grad=True when constructing a tensor, read the flag via the requires_grad attribute, and change it in place via requires_grad_().
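
To make the terminology concrete, here is a minimal sketch using only the standard PyTorch API described above:

```python
import torch

# A leaf tensor that participates in autograd.
w = torch.randn(3, requires_grad=True)
x = torch.randn(3)                 # inputs typically do not require grad

loss = (w * x).sum()
loss.backward()                    # gradients flow only into tensors with requires_grad=True

print(w.requires_grad, w.grad)     # True, tensor([...])
print(x.requires_grad, x.grad)     # False, None

w.requires_grad_(False)            # in-place toggle: w is now "frozen"
```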

In deep learning we usually need the gradient of the loss with respect to all model parameters in order to train the model with optimizers like SGD. But in transfer learning and parameter-efficient fine-tuning, not all model parameters need to be modified: some can be frozen to preserve the knowledge learned on previous tasks and prevent catastrophic forgetting. If we train all the parameters, it is called full fine-tuning. If we only train part of the model, for example only the biases (BitFit), only the LayerNorm layers (LN-Tuning), or some newly added modules that are the only trainable part (as in LoRA, Adapters, and Prompt Tuning), it is called parameter-efficient fine-tuning.
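
For example, BitFit-style freezing is just a loop over named_parameters; a sketch (using a small built-in module as a stand-in for a real pretrained model):

```python
import torch.nn as nn

# Stand-in for a pretrained model.
model = nn.TransformerEncoderLayer(d_model=64, nhead=4)

# BitFit-style recipe: train only the bias terms, freeze everything else.
for name, param in model.named_parameters():
    param.requires_grad = name.endswith("bias")

trainable = [name for name, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the *bias parameters remain trainable
```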

Since models like LLMs are very large, fine-tuning a pretrained model on many downstream tasks means saving a modified copy of the model many times. If the modification is only partial, it saves a lot of storage.

This makes it a useful feature when visualizing a model before training: it is really helpful to see which parts of the model in our training recipe are frozen (do not require grad and do not need to be stored again) and which parts are trainable (require grad).
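
Even a plain-text summary like the sketch below already helps; having the same requires_grad information in the rendered tree would be much nicer. (The helper name here is made up for illustration.)

```python
def summarize_trainable(model):
    """Print each parameter's shape and frozen/trainable status, plus totals."""
    n_trainable = n_frozen = 0
    for name, param in model.named_parameters():
        status = "trainable" if param.requires_grad else "frozen"
        print(f"{name:50s} {str(tuple(param.shape)):>18s}  {status}")
        if param.requires_grad:
            n_trainable += param.numel()
        else:
            n_frozen += param.numel()
    print(f"trainable params: {n_trainable:,} | frozen params: {n_frozen:,}")

summarize_trainable(model)  # e.g. the BitFit-frozen model from the previous sketch
```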

Another model visualization library, bigmodelvis, supports this feature. It shows the model as a rich tree in the console and assigns different colors to the activated (trainable) parameters and the frozen parameters.
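
If I remember its README correctly, it is used roughly like this (treat the exact import and method name as an assumption on my part, not a verified API):

```python
from bigmodelvis import Visualization

# Renders the module tree in the console, coloring trainable vs. frozen parameters differently.
Visualization(model).structure_graph()
```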

danieldjohnson (Collaborator) commented

Thanks for the clarification! Added the requires_grad info to the rendered summary of parameters and other torch tensors.
