No space left on device for docker-build workflow #5840

Open
3 tasks done
xmfcx opened this issue Mar 5, 2025 · 4 comments
Labels
type:bug Software flaws or errors.

Comments

@xmfcx
Contributor

xmfcx commented Mar 5, 2025

Checklist

  • I've read the contribution guidelines.
  • I've searched other issues and no duplicate issues were found.
  • I'm convinced that this is not my fault but a bug.

Description

The workflows are failing with Error: No space left on device.

Filesystem      Size  Used Avail Use% Mounted on
/dev/root        73G   21G   53G  28% /

There is plenty of available space.

It also fails early in the workflow.

Possible causes

  • Did we add something recently?
  • Has the Docker build cache grown too large?

cc. @youtalk @mitsudome-r @oguzkaganozt

@mitsudome-r
Member

The only thing I can think of is the CUDA upgrade, but I'm not sure why it started failing at this point.

@youtalk
Member

youtalk commented Mar 6, 2025

I hope #5830 resolves this issue.

@mitsudome-r
Member

I investigated the size of the build cache with the following commands.

Before:
skopeo inspect --raw docker://ghcr.io/autowarefoundation/autoware-buildcache@sha256:60073907bce4f4fe9e7ce14a1d4d4cb3920a8aefe8089c4396897f1ad190c4fa | jq '[.manifests[].size] | add'
I get about 11.5 GB

Latest:
skopeo inspect --raw docker://ghcr.io/autowarefoundation/autoware-buildcache:amd64-main | jq '[.manifests[].size] | add'
I get about 4.2 GB

@xmfcx
Contributor Author

xmfcx commented Mar 7, 2025

So this was caused by an overgrown cache.

I'm not sure how to prevent it from growing.
A simple solution would be to remove the cache periodically, maybe weekly, so that it never exceeds a size limit.
Alternatively, a daily workflow could check its size and remove it once it goes over the limit.
I think the daily check is a workable solution until we have other ways to keep its size limited.
The size can be checked with the command @mitsudome-r shared above.
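
A rough sketch of what the daily size check could look like, assuming Python and skopeo are available on the runner. The function names and the 8 GiB limit are hypothetical (no concrete limit was agreed on in this thread), and the cache removal step itself is left out. The sum is the same one the jq expression above computes.

```python
import json
import subprocess

# Hypothetical limit: this thread does not name a concrete threshold.
SIZE_LIMIT_BYTES = 8 * 1024**3


def total_manifest_size(raw_manifest: str) -> int:
    """Sum the `size` field of every entry under `manifests`.

    Same sum as `jq '[.manifests[].size] | add'`, except that a missing
    or empty `manifests` list yields 0 here instead of null.
    """
    index = json.loads(raw_manifest)
    return sum(entry["size"] for entry in index.get("manifests", []))


def cache_is_oversized(raw_manifest: str, limit_bytes: int = SIZE_LIMIT_BYTES) -> bool:
    """Return True when the summed size is strictly above the limit."""
    return total_manifest_size(raw_manifest) > limit_bytes


def fetch_raw_manifest(image_ref: str) -> str:
    """Run `skopeo inspect --raw` for the given image reference.

    Needs skopeo on PATH and network access to the registry.
    """
    result = subprocess.run(
        ["skopeo", "inspect", "--raw", f"docker://{image_ref}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout


def check_cache(image_ref: str, limit_bytes: int = SIZE_LIMIT_BYTES) -> bool:
    """Fetch the manifest and report whether the cache should be removed."""
    return cache_is_oversized(fetch_raw_manifest(image_ref), limit_bytes)
```

A scheduled workflow could call `check_cache("ghcr.io/autowarefoundation/autoware-buildcache:amd64-main")` and run the removal step only when it returns True.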


3 participants