
An error occurred while updating the network with MSTDP learning rule #488

Closed

AptX395 opened this issue May 17, 2021 · 6 comments

@AptX395
AptX395 commented May 17, 2021

I submitted issue #487 describing an error I had encountered.

After further investigation, I don't think it is caused by incorrectly set network parameters; I posted the specific discussion in #487.

I think the problem occurs in the update phase of the network rather than in the forward-propagation phase.

My code separates forward propagation from network updates. That is, I set self._net.train(mode = False) before calling self._net.run(), as follows:

self._net.train(mode = False)
spike = self._encoder(x)    # PoissonEncoder
input = {"input": spike}
self._net.run(input, self._time, one_step = True)

to make the network behave like:

with torch.no_grad():
    model.forward(x)

in PyTorch.
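As an illustration, a quick sanity check along these lines (a sketch only; net, encoder, x, and time stand in for the members above) would be to verify that running with learning disabled leaves the connection weights untouched:

import torch

# Sketch: with learning disabled, a run should not modify any connection weights.
net.train(mode = False)
w_before = {name: conn.w.clone() for name, conn in net.connections.items()}
net.run({"input": encoder(x)}, time, one_step = True)
assert all(torch.equal(w_before[name], conn.w) for name, conn in net.connections.items())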


Forward propagation works fine without any errors, but when I update the network parameters as follows:

self._net.train(mode = True)

for connection in self._net.connections:
    self._net.connections[connection].update(reward = reward)

I get the error mentioned in #487.

The most critical error message is:

  File "/usr/local/anaconda3/envs/snn_rl/lib/python3.7/site-packages/bindsnet/learning/learning.py", line 669, in _conv2d_connection_update
    self.eligibility = self.eligibility.view(self.connection.w.size())
RuntimeError: shape '[32, 4, 8, 8]' is invalid for input of size 262144

What is the cause of this error?

I really appreciate your help!

@Hananel-Hazan
Collaborator

Hananel-Hazan commented May 18, 2021

Thank you for investigating the error; it is indeed a strange one.

Can you elaborate on how this code is being used? In #487 the failing code starts by loading a pickle file:
model_path = os.path.join(MODEL_DIR, f"{args.model}_{args.env_id}_{time_str}.pkl")

If so, could you please try running the code without pickling the network?

Update:
I ran the code from your profile https://github.com/AptX395/SNN_RL and it seems to run; at least TensorBoard seems to be updating. I didn't get the error above. Could you please provide code with which I can replicate the error?

@AptX395
Author

AptX395 commented May 18, 2021


Thank you for trying to run my code.

I run my code by calling agent.learn(), and model_path = os.path.join(MODEL_DIR, f"{args.model}_{args.env_id}_{time_str}.pkl") is an argument to agent.learn().

Actually, I call agent.learn() as follows:

main_net = DSQN_RSTDP(args.history_len, action_num, args.learning_rate, device, args.time, args.dt)
target_net = DSQN_RSTDP(args.history_len, action_num, args.learning_rate, device, args.time, args.dt)
agent = DSQN_RSTDP_Agent(eval_env, main_net, writer, train_env, target_net, replay_memory)
agent.learn(
    args.timestep_num, args.replay_start_size, args.minibatch_size, args.discount_factor,
    args.update_freq, args.target_net_update_freq, args.eval_freq, args.eval_episode_num,
    args.eval_epsilon, args.init_epsilon, args.final_epsilon, args.final_epsilon_frame,
    model_path = os.path.join(MODEL_DIR, f"{args.model}_{args.env_id}_{time_str}.pkl"),
)

Inside agent.learn() is the standard DQN training loop, as follows:

    def learn(self, timestep_num, replay_start_size, minibatch_size, discount_factor, update_freq, target_net_update_freq,
                eval_freq, eval_episode_num, eval_epsilon, init_epsilon, final_epsilon, final_epsilon_frame, model_path):
        logger.info("Start learning")
        self._main_net.train()
        self._update_target_net()
        self._target_net.eval()
        self._explore(replay_start_size, discount_factor)
        timestep = 0
        max_average_score = numpy.NINF

        while timestep < timestep_num:
            is_done = False
            next_frame_stack = self._train_env.reset()

            while not is_done:
                frame_stack = next_frame_stack
                action = self._select_action(timestep, init_epsilon, final_epsilon, final_epsilon_frame, frame_stack)
                (next_frame_stack, reward, is_done, info) = self._train_env.step(action)
                mask = 0.0 if is_done else discount_factor
                transition = (frame_stack, action, reward, mask, next_frame_stack)
                self._replay_memory.append(transition)

                if timestep % update_freq == 0:
                    minibatch = self._replay_memory.sample(minibatch_size)
                    self._update_main_net(minibatch[0], minibatch[1], minibatch[2], minibatch[3], minibatch[4])

                if timestep % target_net_update_freq == 0:
                    self._update_target_net()

                if timestep % eval_freq == 0:
                    max_average_score = self._evaluate(eval_episode_num, eval_epsilon, timestep, max_average_score, model_path)

                timestep += 1

        logger.info("Learning is done")

The network update occurs in self._update_main_net(), which looks like this:

    def _update_main_net(self, frame_stacks, actions, rewards, masks, next_frame_stacks):
        y = self._calculate_y(rewards, masks, next_frame_stacks)
        q = self._calculate_q(frame_stacks, actions)
        self._main_net.update(y, q)

    def _calculate_y(self, rewards, masks, next_frame_stacks):
        next_states = self._frame_stack_to_state(next_frame_stacks, is_batch = True)
        output = self._target_net.predict(next_states)
        max_target_q = output.max(dim = 1)[0].unsqueeze(dim = -1)
        rewards = torch.tensor(rewards, dtype = torch.float32, device = self._device).unsqueeze(dim = -1)
        masks = torch.tensor(masks, dtype = torch.float32, device = self._device).unsqueeze(dim = -1)
        y = rewards + masks * max_target_q
        return y

    def _calculate_q(self, frame_stacks, actions):
        states = self._frame_stack_to_state(frame_stacks, is_batch = True)
        output = self._main_net.predict(states)    # overloaded in `DSQN_RSTDP_Agent()`
        actions = torch.tensor(actions, dtype = torch.int64, device = self._device).unsqueeze(dim = -1)
        q = output.gather(dim = 1, index = actions)
        return q

Here, self._calculate_q() is overridden in DSQN_RSTDP_Agent().

I'm not sure whether you ran DSQN_RSTDP; python run.py --model=DSQN_RSTDP --device=cpu is the command I used.

By the way, I also ran into another problem: the program doesn't work properly on the GPU. After repeated checks and attempts, I found it only runs correctly on the CPU, which is why I use --device=cpu.

The environment I'm using is:

python=3.7.10
pytorch=1.7.1
gym=0.18.0
bindsnet=0.2.8

Could you please try it again?

@SimonInParis
Collaborator

Hi.
Just ran your code. Still investigating.

It seems we missed a call to the batch reduction() function (which you can select, by the way; the default is torch.mean()). I'm still investigating that.

So I changed your batch size to 1.

It then ran much longer, but it still hits another dimension mismatch later on when applying MSTDP to the all-to-all Connection, probably due to a missing 'a_plus' argument in your call to MSTDP?
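For reference, here is a minimal sketch of the first mismatch under that assumption (shapes taken from the traceback in this issue; the torch.mean reduction is only an illustration of the idea, not the actual BindsNET code). The weights hold 32 * 4 * 8 * 8 = 8192 elements, and 262144 / 8192 = 32, so the eligibility appears to carry one copy per sample of a 32-sample batch:

import torch

w = torch.zeros(32, 4, 8, 8)                # connection.w, 8192 elements
eligibility = torch.zeros(32, *w.shape)     # one eligibility copy per sample, 262144 elements

try:
    eligibility.view(w.size())              # raises the RuntimeError from the traceback
except RuntimeError as error:
    print(error)

reduced = eligibility.mean(dim = 0)         # reduce over the batch dimension
reduced.view(w.size())                      # now matches the weight shape (8192 elements)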

Hananel-Hazan added a commit that referenced this issue May 19, 2021
#488 fix dimensions issues with layers with different shape
@SimonInParis
Collaborator

The last bug has been fixed; it concerned the flattening of neuron layer shapes in the special case of MSTDP on layers with different numbers of dimensions.

One open question remains for the case of multiple simultaneous samples (batch size > 1): how should MSTDP behave? Should it average/sum all incoming/outgoing spikes and compute the synapses' eligibility from that, or simply reject batches?
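For instance, the averaging option could look roughly like this (a sketch only, not BindsNET's implementation; all sizes are made up):

import torch

# Made-up sizes: a batch of 32 samples, 64 pre- and 10 post-synaptic neurons.
batch, n_pre, n_post = 32, 64, 10
pre_spikes = (torch.rand(batch, n_pre) < 0.1).float()      # hypothetical pre-synaptic spikes
post_spikes = (torch.rand(batch, n_post) < 0.1).float()    # hypothetical post-synaptic spikes

# Average over the batch dimension first, so the eligibility keeps the weight shape.
pre_mean = pre_spikes.mean(dim = 0)                         # (n_pre,)
post_mean = post_spikes.mean(dim = 0)                       # (n_post,)
eligibility = pre_mean.unsqueeze(1) * post_mean.unsqueeze(0)    # (n_pre, n_post)

# Reward-modulated update, shaped like an all-to-all connection's weights.
w = torch.zeros(n_pre, n_post)
reward, lr = 1.0, 1e-3
w += lr * reward * eligibility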

@AptX395
Author

AptX395 commented May 20, 2021


Thank you very much for your investigation! Looking forward to your progress!

@SimonInParis
Collaborator

Thanks. You can update to the latest bindsnet now; it should be fixed. Remember to use a batch size of 1 for now.
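Until batched MSTDP is supported, one way to keep the update on a single sample (names taken from the learn() snippet above; this is only a sketch, not a patch) is:

minibatch = self._replay_memory.sample(1)    # one transition instead of minibatch_size
self._update_main_net(minibatch[0], minibatch[1], minibatch[2], minibatch[3], minibatch[4])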
