Memory Consumption Too High #302

Closed

avik-pal opened this issue Jun 18, 2018 · 3 comments

Comments

@avik-pal
Member

avik-pal commented Jun 18, 2018

The actual model I used for testing was ResNet-152, which runs out of memory in Flux at a batch size of around 4. A small repro for the issue:

using Flux
using CuArrays

# Small convolutional stack, moved to the GPU
model = Chain(Conv((3, 3), 3=>64, relu, pad = (1, 1)),
              Conv((3, 3), 64=>128, relu, pad = (1, 1)),
              x -> maxpool(x, (7, 7)),
              Conv((3, 3), 128=>128, relu, pad = (1, 1)),
              Conv((3, 3), 128=>128, relu, pad = (1, 1)),
              x -> maxpool(x, (7, 7))) |> gpu

# A batch of 64 RGB images at 224×224 (WHCN layout)
x = rand(224, 224, 3, 64) |> gpu;
model(x)

A PyTorch equivalent of the above would be

import torch
import torch.nn as nn
import numpy as np

# Equivalent convolutional stack in PyTorch
model = nn.Sequential(nn.Conv2d(3, 64, 3, padding = 1),
                      nn.ReLU(),
                      nn.Conv2d(64, 128, 3, padding = 1),
                      nn.ReLU(),
                      nn.MaxPool2d((7, 7)),
                      nn.Conv2d(128, 128, 3, padding = 1),
                      nn.ReLU(),
                      nn.Conv2d(128, 128, 3, padding = 1),
                      nn.ReLU(),
                      nn.MaxPool2d((7, 7)))
model = model.cuda()

# A batch of 64 RGB images at 224x224 (NCHW layout)
x = torch.Tensor(np.random.rand(64, 3, 224, 224)).cuda()
model(x)

The models were tested on a 16 GB P100 GPU. The PyTorch code also runs fine on a 12 GB 1080 Ti GPU.
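For reference, a rough way to quantify the gap is to read the free device memory before and after the forward pass. This is only a sketch, assuming CUDAdrv's Mem.info() (a wrapper around cuMemGetInfo that returns free and total bytes) is available; because CuArrays pools allocations, it measures memory claimed from the driver rather than live buffers:

using CUDAdrv   # assumed to be available alongside CuArrays

free_before, total = CUDAdrv.Mem.info()   # free and total device memory, in bytes
model(x)                                  # forward pass from the repro above
free_after, _ = CUDAdrv.Mem.info()

println("approx. device memory claimed: ",
        round((free_before - free_after) / 2^20, digits = 1), " MiB")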

@ViralBShah
Member

@maleadt You said on Slack that references are perhaps being kept alive longer than they need to be in Julia/Flux, since we only do a complete GC scan when we hit memory pressure. Copying you here in case my understanding is incorrect.
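For context, the allocation path being referred to roughly follows a collect-and-retry pattern: a full GC scan only happens once an allocation actually fails, so arrays that are already dead from Julia's point of view can keep occupying device memory until that point. A minimal sketch of the idea (not the actual CuArrays code; try_alloc is a hypothetical low-level allocation call that returns nothing on failure):

function alloc_with_pressure(bytes)
    ptr = try_alloc(bytes)       # hypothetical raw allocation
    ptr !== nothing && return ptr

    GC.gc(false)                 # cheap incremental collection first
    ptr = try_alloc(bytes)
    ptr !== nothing && return ptr

    GC.gc(true)                  # full scan only under real memory pressure
    ptr = try_alloc(bytes)
    ptr === nothing && throw(OutOfMemoryError())
    return ptr
end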

@maleadt
Collaborator

maleadt commented Jun 19, 2018

That is correct, see https://github.com/JuliaGPU/CuArrays.jl/blob/e06ab7cf63bd249ef10a3511cde0df39d1463a05/src/memory.jl#L210-L238

I guess we could add a debug mode that keeps track of the stack trace for every allocation and dumps the live ones when encountering a true OOM situation. I'll have a look once there's 0.7 compatibility.
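A minimal sketch of what such a debug mode could look like (hypothetical names, not an actual CuArrays API): it associates a stack trace with every outstanding allocation and prints them all when an out-of-memory error is finally hit.

const ALLOC_SITES = Dict{UInt,Vector{Base.StackTraces.StackFrame}}()

function tracked_alloc(bytes)
    ptr = alloc_with_pressure(bytes)        # allocator sketch from the comment above
    ALLOC_SITES[UInt(ptr)] = stacktrace()   # remember which code path asked for this buffer
    return ptr
end

tracked_free(ptr) = delete!(ALLOC_SITES, UInt(ptr))

function dump_live_allocations(io = stderr)
    for (ptr, st) in ALLOC_SITES
        println(io, "live allocation at 0x", string(ptr, base = 16))
        foreach(frame -> println(io, "  ", frame), st)
    end
end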

@MikeInnes
Member

Not sure if this is fixed, but the situation has changed significantly (e.g. we have #465 and lots of CuArrays improvements). We can figure out other issues as they come up.
