An error occurred while updating the network with MSTDP learning rule #488
Comments
Thank you for investigating the error; indeed, this is a strange error. Can you elaborate on how this code is being used? In #487 the failing code starts with loading a pickle file. If so, could you please try to run the code without pickling the network?

Update:
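If it helps, BindsNET has its own save/load helpers that can replace manual pickling; a minimal sketch (the file name is a placeholder and `net` stands for your network object):

```python
from bindsnet.network import load

# save the trained network via BindsNET's built-in helper
net.save("net.pt")

# ...later, restore it instead of unpickling it yourself
net = load("net.pt")
```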
Thank you for trying to run my code. Actually, I call:

```python
main_net = DSQN_RSTDP(args.history_len, action_num, args.learning_rate, device, args.time, args.dt)
target_net = DSQN_RSTDP(args.history_len, action_num, args.learning_rate, device, args.time, args.dt)
agent = DSQN_RSTDP_Agent(eval_env, main_net, writer, train_env, target_net, replay_memory)
agent.learn(args.timestep_num, args.replay_start_size, args.minibatch_size, args.discount_factor,
            args.update_freq, args.target_net_update_freq, args.eval_freq, args.eval_episode_num,
            args.eval_epsilon, args.init_epsilon, args.final_epsilon, args.final_epsilon_frame,
            model_path = os.path.join(MODEL_DIR, f"{args.model}_{args.env_id}_{time_str}.pkl"))
```

Then inside the `learn` method:

```python
def learn(self, timestep_num, replay_start_size, minibatch_size, discount_factor, update_freq, target_net_update_freq,
          eval_freq, eval_episode_num, eval_epsilon, init_epsilon, final_epsilon, final_epsilon_frame, model_path):
    logger.info("Start learning")
    self._main_net.train()
    self._update_target_net()
    self._target_net.eval()
    # pre-fill the replay memory with `replay_start_size` transitions
    self._explore(replay_start_size, discount_factor)
    timestep = 0
    max_average_score = numpy.NINF
    while timestep < timestep_num:
        is_done = False
        next_frame_stack = self._train_env.reset()
        while not is_done:
            frame_stack = next_frame_stack
            action = self._select_action(timestep, init_epsilon, final_epsilon, final_epsilon_frame, frame_stack)
            (next_frame_stack, reward, is_done, info) = self._train_env.step(action)
            # mask folds the discount factor and the terminal flag together
            mask = 0.0 if is_done else discount_factor
            transition = (frame_stack, action, reward, mask, next_frame_stack)
            self._replay_memory.append(transition)
            if timestep % update_freq == 0:
                minibatch = self._replay_memory.sample(minibatch_size)
                self._update_main_net(minibatch[0], minibatch[1], minibatch[2], minibatch[3], minibatch[4])
            if timestep % target_net_update_freq == 0:
                self._update_target_net()
            if timestep % eval_freq == 0:
                max_average_score = self._evaluate(eval_episode_num, eval_epsilon, timestep, max_average_score, model_path)
            timestep += 1
    logger.info("Learning is done")
```
The network update occurs in `_update_main_net`:

```python
def _update_main_net(self, frame_stacks, actions, rewards, masks, next_frame_stacks):
    y = self._calculate_y(rewards, masks, next_frame_stacks)
    q = self._calculate_q(frame_stacks, actions)
    self._main_net.update(y, q)

def _calculate_y(self, rewards, masks, next_frame_stacks):
    next_states = self._frame_stack_to_state(next_frame_stacks, is_batch = True)
    output = self._target_net.predict(next_states)
    max_target_q = output.max(dim = 1)[0].unsqueeze(dim = -1)
    rewards = torch.tensor(rewards, dtype = torch.float32, device = self._device).unsqueeze(dim = -1)
    masks = torch.tensor(masks, dtype = torch.float32, device = self._device).unsqueeze(dim = -1)
    # one-step TD target: y = r + mask * max_a' Q_target(s', a'), where mask is 0 or discount_factor
    y = rewards + masks * max_target_q
    return y

def _calculate_q(self, frame_stacks, actions):
    states = self._frame_stack_to_state(frame_stacks, is_batch = True)
    output = self._main_net.predict(states)  # `predict` is overloaded in `DSQN_RSTDP_Agent()`
    actions = torch.tensor(actions, dtype = torch.int64, device = self._device).unsqueeze(dim = -1)
    q = output.gather(dim = 1, index = actions)
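    # q holds Q(s, a) for the actions actually taken, shaped (batch_size, 1),
    # so it lines up elementwise with the TD targets y from _calculate_y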
    return q
```

I'm not sure if you are running [...]

By the way, I also encountered another problem: when using the GPU, the program didn't work properly. I checked and retried repeatedly and found that it only works properly on the CPU, which is why I use the CPU.

The environment I'm using is: [...]
Could you please try it again?
Hi. It seems we missed a call to the batch `reduction()` function, which you can select, by the way; the default is `torch.mean()`. I'm still investigating that. So I changed your batch size to 1. It then ran much longer, but still hits another dimension mismatch later on when applying MSTDP to the all-to-all `Connection`, probably due to a missing `a_plus` argument in your call to MSTDP?
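In case it's useful, here is a minimal sketch of an MSTDP-trained all-to-all connection with an explicit batch reduction; all sizes, shapes, and `nu` values are placeholders rather than anything from this issue:

```python
import torch
from bindsnet.network import Network
from bindsnet.network.nodes import Input, LIFNodes
from bindsnet.network.topology import Connection
from bindsnet.learning import MSTDP

net = Network(dt=1.0)
inp = Input(shape=(1, 28, 28))  # multi-dimensional input layer
out = LIFNodes(n=10)            # flat output layer (different number of dims)
net.add_layer(inp, name="X")
net.add_layer(out, name="Y")

# all-to-all connection trained with reward-modulated STDP;
# `reduction` controls how per-sample updates are combined across a batch
net.add_connection(
    Connection(
        source=inp,
        target=out,
        update_rule=MSTDP,
        nu=(1e-4, 1e-2),       # placeholder learning rates
        reduction=torch.mean,  # the default mentioned above
    ),
    source="X",
    target="Y",
)
```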
#488 fix dimension issues with layers with different shapes
The last bug has been fixed; it concerned the flattening of neuron-layer shapes, in the special case of MSTDP on layers with a different number of dimensions. One open question remains in the case of multiple simultaneous samples (batch size > 1): [...]
Thank you very much for your investigation! Looking forward to your progress!
Thanks. You can update to the latest bindsnet now; it should be fixed. Remember to use a batch size of 1 for now.
I submitted issue #487, which explained an error I had encountered.
After investigating, I don't think it is caused by incorrectly set network parameters; I posted the specific discussion in #487.
I think this problem occurs in the update phase of the network rather than in the forward-propagation phase.
My code separates forward propagation from network updates. That is, I set `self._net.train(mode = False)` before calling `self._net.run()`, to make the network behave like a model in `eval()` mode in PyTorch.
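For concreteness, a minimal sketch of the forward-only pass, assuming `self._net` is a BindsNET `Network` with an input layer named `"X"` and `encoded` is an already-encoded spike tensor (these names are illustrative):

```python
# inference only: turn learning off for every layer and connection,
# analogous to model.eval() in PyTorch
self._net.train(mode=False)
self._net.run(inputs={"X": encoded}, time=self._time)
```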
My code works fine for forward propagation without any errors. But when I update the network parameters, I get the error mentioned in #487.
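The update pass is along these lines (a simplified sketch with illustrative names, not my exact code; in BindsNET, reward-modulated rules such as MSTDP receive the reward through `run`):

```python
# learning enabled: MSTDP adjusts connection weights during this run
self._net.train(mode=True)
self._net.run(inputs={"X": encoded}, time=self._time, reward=reward)
```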
The most critical error message is: [...]
What is the cause of this error?
I really appreciate your help!