The purpose of this repository is to provide a walkthrough for adding Perforated AI's Perforated Backpropagation™ to your code. When starting a new project, first add the sections from this README. Once they have been added, you can run your code, and it will give you errors and warnings indicating whether any "customization" coding is actually required for your architecture. The ways to fix these are in customization.md. The customization README also describes alternative options to the recommended settings here. After running your pipeline you can view the graph that shows the correlation values and experiment with the other settings in customization.md that may help get better results.
Files in the perforatedai folder provide information about the functions and variables in the actual repository to ease usage and testing.
First, install perforatedai and safetensors with:
pip install perforatedai safetensors
These are all the imports you will need at the top of your main training file. If some of the code below ends up in other files, those files will need the same imports.
import torch
from perforatedai import pb_globals as PBG
from perforatedai import pb_models as PBM
from perforatedai import pb_utils as PBU
There are many different configuration settings you can play with. The full list, with detailed descriptions, can be found in this API repository under perforatedai/pb_globals.py. However, the following are the most important; they do not have default values because they should be considered in every project.
# When to switch between Dendrite learning and neuron learning.
PBG.switchMode = PBG.doingHistory
# How many normal epochs to wait before switching modes. Make sure this is higher than your scheduler's patience.
PBG.nEpochsToSwitch = 10
# Same as above for Dendrite epochs
PBG.pEpochsToSwitch = 10
# The default shape of input tensors
PBG.inputDimensions = [-1, 0, -1, -1]
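A couple of illustrative settings, under the assumption (based on the default above) that 0 marks the neuron/channel index dimension and -1 marks variable dimensions such as batch size:
# 2D CNN activations shaped [batch, channels, height, width]:
PBG.inputDimensions = [-1, 0, -1, -1]
# Flat MLP activations shaped [batch, features]:
# PBG.inputDimensions = [-1, 0]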
Every switch to Dendrite learning increases the size of your network. Because of this, we recommend starting with the following setting. It tells the system to add Dendrites at every epoch, which lets you test how many Perforated AI cycles you can add before running out of memory. It also quickly surfaces anything else that might go wrong with your configuration, rather than running many wasted epochs before finding out. To ensure maximum efficacy the system should be tested up to 3 Dendrites (Cycle 6). However, it is also reasonable to test with just 1 Dendrite (Cycle 2) if memory restrictions mean you only want to add a maximum of 1 Dendrite.
PBG.testingDendriteCapacity = True
A large benefit PAI provides is automatic conversion of networks to work with Dendrite Nodes through the convertNetwork function.
The call to convertNetwork should be made directly after the model is initialized, before cuda and parallel calls. Afterwards, an object that tracks the converted modules for Perforated Backpropagation must be initialized.
model = yourModel()
model = PBU.convertNetwork(model)
PBG.pbTracker.initialize(
doingPB = True, #This can be set to false if you want to do just normal training
saveName='yourSaveName', # choose a save name; change it for different parameter runs
maximizingScore=True, # True for maximizing validation score, false for minimizing validation loss
makingGraphs=True) # True if you want graphs to be saved
Setting maximizingScore to False can make implementation quicker, but it is generally better to look at the actual validation score rather than the raw loss values. Loss can sometimes continue to be reduced as correct outputs become "more" correct without the number of incorrect outputs actually decreasing. If choosing to minimize loss, a setting that can help mitigate this is raising PBG.improvementThreshold. The default is 1e-4, but setting it to 0.001 will only count a loss reduction if the current cycle is at least 0.1% better than the previous cycle.
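For example, to require at least a 0.1% improvement per cycle when minimizing loss:
PBG.improvementThreshold = 0.001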
Your optimizer and scheduler should be set up this way instead. optimizer.step() should be kept where it is, but the scheduler gets stepped inside our code, so get rid of your scheduler.step() if you have one. We recommend ReduceLROnPlateau, but any scheduler and optimizer should work.
PBG.pbTracker.setOptimizer(torch.optim.Adam)
PBG.pbTracker.setScheduler(torch.optim.lr_scheduler.ReduceLROnPlateau)
optimArgs = {'params':model.parameters(),'lr':learning_rate}
schedArgs = {'mode':'max', 'patience': 5} # make sure patience is lower than nEpochsToSwitch and pEpochsToSwitch above
optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
Get rid of your scheduler.step() if there is one. If your scheduler is doing
things in other functions beyond a simple scheduler.step(), this can cause
problems and you should simply not add the scheduler to our system.
We intentionally leave this uncommented inside the code block so it is not forgotten.
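The same setup pattern applies to any other pair. For example, a sketch using SGD with StepLR (the argument dicts mirror what you would normally pass to the PyTorch constructors; learning_rate is your own variable):
PBG.pbTracker.setOptimizer(torch.optim.SGD)
PBG.pbTracker.setScheduler(torch.optim.lr_scheduler.StepLR)
optimArgs = {'params': model.parameters(), 'lr': learning_rate, 'momentum': 0.9}
schedArgs = {'step_size': 10, 'gamma': 0.1}
optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)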
Training in general can stay exactly as it is. But at the end of your training loop, if you would like to track the training score as well, you can optionally add:
PBG.pbTracker.addExtraScore(trainingScore, 'Train')
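For instance, in a typical classification loop (a sketch; train_loader, loss_fn, and the accuracy bookkeeping are placeholders for your own code):
correct, total = 0, 0
for data, target in train_loader:
    optimizer.zero_grad()
    output = model(data)
    loss = loss_fn(output, target)
    loss.backward()
    optimizer.step()
    correct += (output.argmax(dim=1) == target).sum().item()
    total += target.size(0)
trainingScore = correct / total
PBG.pbTracker.addExtraScore(trainingScore, 'Train')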
If you run testing periodically at the same time as validation (this is recommended), you can also call:
PBG.pbTracker.addTestScore(testScore, 'Test Accuracy')
The test score should obviously not be used in the same way as the validation score for early stopping or other decisions. However, by calculating it at each epoch and calling this function, the system will automatically keep track of the test score affiliated with the highest validation score during each neuron training iteration. This creates a CSV file (..bestTestScore.csv) that neatly tracks the parameter counts of each cycle as well as the test score at the point of the highest validation score for that Dendrite count. If you do not call this function, the validation score will be used when producing this CSV file. This function should be called before addValidationScore.
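Concretely, the per-epoch call order looks like this (the score variables are placeholders from your own loops):
PBG.pbTracker.addExtraScore(trainingScore, 'Train')
PBG.pbTracker.addTestScore(testScore, 'Test Accuracy')
# ...followed by addValidationScore, shown below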
At the end of your validation loop the following must be called so the tracker knows when to switch between Dendrite learning and normal learning:
model, improved, restructured, trainingComplete = PBG.pbTracker.addValidationScore(score,
    model, # pass model.module if it is wrapped in DataParallel
    'yourSaveName') # enter the same save name as above
The following line should be replaced with whatever you are using to set up the GPU, including DataParallel:
model.to(device)
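# If you wrap your model with DataParallel, do that here instead; a sketch to
# adapt to your own GPU setup:
# model = torch.nn.DataParallel(model).to(device)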
if(trainingComplete):
    break # or do whatever you need to do once training is over
elif(restructured):
    # If the model was restructured, reset the optimizer and scheduler with the
    # same block of code you used above.
    optimArgs = {'params': model.parameters(), 'lr': learning_rate} # your args from above
    schedArgs = {'mode': 'max', 'patience': 5} # your args from above
    optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
Description of variables:
model - The model to continue using. It may be the same model or a restructured model.
improved - True if the current model has improved upon previous ones and is the current best. This variable can be used however you wish.
restructured - True if the model has been restructured, either by adding new Dendrites or by incorporating trained Dendrites into the model.
trainingComplete - True if training is completed and further training will not be performed.
score - The validation score you are using to determine if the model is improving. It can be an actual score like accuracy, or the loss value. If you are using a loss value, be sure you set maximizingScore to False when you called initialize().
If this is called from within a test/validation function, you'll need to return the values from that function:
return model, optimizer, scheduler
Then capture all three where the function is called, like this:
model, optimizer, scheduler = validate(model, otherArgs)
Additionally, make sure all three are passed into the function; otherwise they won't be defined when the network is not restructured.
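Putting the pieces together, a minimal sketch of such a validation function (val_loader, device, learning_rate, and the accuracy computation are placeholders for your own code; this sketch also returns trainingComplete so the caller can end training):
def validate(model, optimizer, scheduler, val_loader, device):
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for data, target in val_loader:
            output = model(data.to(device))
            correct += (output.argmax(dim=1) == target.to(device)).sum().item()
            total += target.size(0)
    model, improved, restructured, trainingComplete = PBG.pbTracker.addValidationScore(
        correct / total, model, 'yourSaveName')
    model.to(device)
    if(restructured):
        optimArgs = {'params': model.parameters(), 'lr': learning_rate}
        schedArgs = {'mode': 'max', 'patience': 5}
        optimizer, scheduler = PBG.pbTracker.setupOptimizer(model, optimArgs, schedArgs)
    return model, optimizer, scheduler, trainingComplete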
With this short README you are now set up to try your first experiment. To understand the output files, take a look at the output README. Warnings will automatically be generated when problems occur, which can be debugged using the customization README; that README also contains some suggestions at the end for optimization which can help make results even better. If any actual problems occur that are not caught and shown to you, we have also created a debugging README with some of the errors we have seen and our suggestions for tracking them down. Please check the debugging README first, but then feel free to contact us if you'd like help.