remove temporary files after crash. #43

Open
etiennesky opened this issue Dec 20, 2018 · 4 comments

Comments

@etiennesky
Contributor

Hi,
Pablo from BSC reports that many temporary files remain after running ece3-postproc. Some of the folders are quite large; these are left behind when the scripts crash. It seems that folders created by mktemp are not automatically deleted.

As a workaround, we could put them in a unique folder (e.g. $SCRATCH/tmp_ecearth3/$expid) and delete that folder later on.
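
A minimal sketch of this workaround, assuming bash. The names `tmp_root`, `make_exp_tmpdir` and `clean_exp_tmpdir` are hypothetical and do not come from ece3-postproc; the fallback to /tmp when $SCRATCH is unset is also an assumption made only so the snippet runs anywhere.

```shell
#!/usr/bin/env bash
# Sketch only: names below are illustrative, not taken from ece3-postproc.

# Top-level scratch area; falls back to /tmp when SCRATCH is unset (assumption).
tmp_root="${SCRATCH:-/tmp}/tmp_ecearth3"

# Create a unique working directory under a per-experiment parent folder.
# $1 = experiment id (e.g. t034), $2 = tool prefix (e.g. ts, ecmean)
make_exp_tmpdir() {
    local expid="$1" prefix="$2"
    mkdir -p "${tmp_root}/${expid}" || return 1
    mktemp -d "${tmp_root}/${expid}/${prefix}_XXXXXX"
}

# Remove everything left behind for one experiment, crashed runs included.
clean_exp_tmpdir() {
    local expid="$1"
    # ${var:?} aborts when the variable is empty, so an empty id can never
    # expand to the whole tmp_ecearth3 root (or to "/").
    rm -rf -- "${tmp_root:?}/${expid:?}"
}
```

With this layout, the flat listing below would become one sub-folder per experiment (t034, t036, ...), and `clean_exp_tmpdir t034` would remove only that experiment's leftovers.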

A better solution is probably to create the folder in $TMPDIR instead of $SCRATCH, so that the files are deleted when the job finishes. The drawback is that after an error the files are no longer around for debugging. This behaviour may also be platform dependent, so we could define the top-level folder in the conf/ files.

Any thoughts @plesager @pabretonniere @mcastril @aearamos ?

bsc32130@login3:/gpfs/scratch/bsc32/bsc32130/tmp_ecearth3/tmp> du -sh ./*
1.0K    ./ecmean_t034_0qIhHA
1.0K    ./ecmean_t034_5AVzcc
1.0K    ./ecmean_t034_7hlXAh
1.0K    ./ecmean_t034_aBTQwg
1.0K    ./ecmean_t034_eu1TYj
1.0K    ./ecmean_t034_h8JcGP
1.0K    ./ecmean_t034_Kp21yp
1.0K    ./ecmean_t034_NCECNC
1.0K    ./ecmean_t034_nMvzky
1.0K    ./ecmean_t034_tLcSlM
1.0K    ./ecmean_t036_9DaDrU
12M     ./ecmean_t036_9Dj2kD
1.0K    ./ecmean_t036_aDGc1c
1.0K    ./ecmean_t036_j2ILmS
1.0K    ./ecmean_t036_K9GEaU
1.0K    ./ecmean_t036_moPODc
1.0K    ./ecmean_t036_PfZi18
1.0K    ./ecmean_t036_QlHJzb
1.0K    ./ecmean_t036_SLHQMb
1.0K    ./ecmean_t036_TMC2Ml
1.0K    ./ecmean_t036_wyegWO
1.0K    ./ecmean_t036_XFszDW
1.0K    ./ecmean_t036_zcbSUr
1.0K    ./ecmean_t037_Kvp64O
1.0K    ./ecmean_t037_PaWzMK
1.0K    ./ecmean_t037_pcoGjx
1.0K    ./ecmean_t037_psVNmT
1.0K    ./ecmean_t037_RhDaIq
1.0K    ./ecmean_t037_rSDgjw
1.0K    ./ecmean_t037_uOMjIr
1.0K    ./ecmean_t037_UyaO23
1.0K    ./ecmean_t037_xKNQ6Z
1.0K    ./ecmean_t038_C2kjvM
1.0K    ./ecmean_t038_gGbr4D
1.0K    ./ecmean_t038_MGIaP6
1.0K    ./ecmean_t038_OcpdnF
1.0K    ./ecmean_t039_2Bu4Lk
1.0K    ./ecmean_t039_3UUYUb
1.0K    ./ecmean_t039_AVLCMI
1.0K    ./ecmean_t039_bNGqsY
1.0K    ./ecmean_t039_eOPqkL
1.0K    ./ecmean_t039_jszxmS
1.0K    ./ecmean_t039_RtxAe4
1.0K    ./ecmean_t039_WF2IaJ
1.0K    ./ecmean_t03b_3CbPFV
1.0K    ./ecmean_t03b_7KPrwv
1.0K    ./ecmean_t03b_8oi9WQ
1.0K    ./ecmean_t03b_G9KKbE
1.0K    ./ecmean_t03b_rqTzpX
1.0K    ./ecmean_t03b_SLhIu2
1.0K    ./ecmean_t03b_UpQaN1
1.0K    ./ecmean_t03d_7otzNM
1.0K    ./ecmean_t03d_bmolP1
1.0K    ./ecmean_t03d_FogxJU
1.0K    ./ecmean_t03d_hbemTS
1.0K    ./ecmean_t03d_ihXyHO
1.0K    ./ecmean_t03d_janpqU
1.0K    ./ecmean_t03d_KzHd1V
1.0K    ./ecmean_t03d_onzYq2
1.0K    ./ecmean_t03d_sc94YQ
1.0K    ./ecmean_t03d_t7RuV2
1.0K    ./ecmean_t03d_yP492b
1.0K    ./ecmean_t03d_Z04SFu
1.0K    ./ecmean_t03w_dWgOGi
1.0K    ./ecmean_t03w_gfiTWZ
1.0K    ./ecmean_t03w_Iklu7s
1.0K    ./ecmean_t03w_lODpQD
1.0K    ./ecmean_t03w_QEGFUP
712M    ./hireclim2_t036_aECpbh
5.9G    ./hireclim2_t04h_Cpy43d
5.9G    ./hireclim2_t04h_DQIRye
1.0K    ./ts_t034_0KOyGt
1.0K    ./ts_t034_0phAMu
1.0K    ./ts_t034_18uqjz
1.0K    ./ts_t034_1tQK9T
1.0K    ./ts_t034_53ye5H
1.0K    ./ts_t034_5Oufv4
1.0K    ./ts_t034_9f7azq
1.0K    ./ts_t034_9MqHz7
1.0K    ./ts_t034_aHdHS6
1.0K    ./ts_t034_Ax1NnU
1.0K    ./ts_t034_BiGQbp
1.0K    ./ts_t034_bWUbzd
1.0K    ./ts_t034_cg4u2K
1.0K    ./ts_t034_dYodFo
1.0K    ./ts_t034_E23G8h
1.0K    ./ts_t034_fKsSxh
1.0K    ./ts_t034_FV33Xt
1.0K    ./ts_t034_GkCJ9s
1.0K    ./ts_t034_GyVqgZ
1.0K    ./ts_t034_H4ADt0
1.0K    ./ts_t034_h5Atqy
1.0K    ./ts_t034_hEWadV
1.0K    ./ts_t034_J7XxLW
1.0K    ./ts_t034_jPtAWS
1.0K    ./ts_t034_JZAtdC
1.0K    ./ts_t034_KfeRz0
1.0K    ./ts_t034_L1Nc6f
1.0K    ./ts_t034_LQ81Du
1.0K    ./ts_t034_LwfbZv
1.0K    ./ts_t034_m1xo1v
1.0K    ./ts_t034_NcPEHs
1.0K    ./ts_t034_O9pjLY
1.0K    ./ts_t034_RdYIgk
1.0K    ./ts_t034_SbBYZy
1.0K    ./ts_t034_tYl5nT
1.0K    ./ts_t034_vh4E3t
1.0K    ./ts_t034_Y6bOmc
1.0K    ./ts_t034_yIPgVg
1.0K    ./ts_t034_YMHr1F
1.0K    ./ts_t034_zV8Mvh
1.0K    ./ts_t036_0p0LTe
1.0K    ./ts_t036_5OwDj9
1.0K    ./ts_t036_5SEfdQ
1.0K    ./ts_t036_7kIrY2
1.0K    ./ts_t036_7OTw7z
1.0K    ./ts_t036_7tfCeQ
1.0K    ./ts_t036_9QOq2A
1.0K    ./ts_t036_a1ZA7D
1.0K    ./ts_t036_alfU9m
1.0K    ./ts_t036_cncSax
1.0K    ./ts_t036_CZY0Jq
1.0K    ./ts_t036_D20zEb
1.0K    ./ts_t036_ddYIYB
1.0K    ./ts_t036_DmtKuy
1.0K    ./ts_t036_DUbv2b
1.0K    ./ts_t036_E8M0iW
1.0K    ./ts_t036_H1tFYJ
1.0K    ./ts_t036_h86GtI
1.0K    ./ts_t036_hlrwlx
1.0K    ./ts_t036_ifI4Pz
1.0K    ./ts_t036_KLKOI3
1.0K    ./ts_t036_LUcN5l
1.0K    ./ts_t036_McbcRN
1.0K    ./ts_t036_n5U5Tx
1.0K    ./ts_t036_NV5Fwy
1.0K    ./ts_t036_nYy4ny
1.0K    ./ts_t036_O99ceI
1.0K    ./ts_t036_oSQtQV
1.0K    ./ts_t036_OU5R0l
1.0K    ./ts_t036_P2EAUi
1.0K    ./ts_t036_R50CSJ
1.0K    ./ts_t036_RuSmfS
1.0K    ./ts_t036_si3xB1
1.0K    ./ts_t036_T0GYDQ
1.0K    ./ts_t036_tQOovK
1.0K    ./ts_t036_W4hnWA
1.0K    ./ts_t036_WeMJWM
1.0K    ./ts_t036_WyVph7
1.0K    ./ts_t036_X4A8OI
1.0K    ./ts_t036_xbJlfZ
1.0K    ./ts_t036_xHvk8T
1.0K    ./ts_t036_xqCxMf
1.0K    ./ts_t036_XTXrXp
1.0K    ./ts_t036_Y8kt7s
1.0K    ./ts_t037_43juK2
1.0K    ./ts_t037_9Wtd3M
1.0K    ./ts_t037_cR6siS
1.0K    ./ts_t037_EI0gQ0
1.0K    ./ts_t037_iLAvQq
1.0K    ./ts_t037_IohtuW
1.0K    ./ts_t037_j5RqFq
1.0K    ./ts_t037_JBMDPd
1.0K    ./ts_t037_Jgxxlc
1.0K    ./ts_t037_jpcZte
1.0K    ./ts_t037_L0AU9C
1.0K    ./ts_t037_l38WQH
1.0K    ./ts_t037_LDjVhm
1.0K    ./ts_t037_o21jMU
1.0K    ./ts_t037_OoFKkM
1.0K    ./ts_t037_PiQeLT
1.0K    ./ts_t037_Q4ZrQk
1.0K    ./ts_t037_qQpel4
1.0K    ./ts_t037_RabU5s
1.0K    ./ts_t037_Rx452j
1.0K    ./ts_t037_s4LTMb
1.0K    ./ts_t037_sdA5Y0
1.0K    ./ts_t037_ThOBg6
1.0K    ./ts_t037_TT6IgE
1.0K    ./ts_t037_ttqQBM
1.0K    ./ts_t037_TziMbP
1.0K    ./ts_t037_uAm3Bp
1.0K    ./ts_t037_Uyw9ze
1.0K    ./ts_t037_uZIWtb
1.0K    ./ts_t037_vioNJE
1.0K    ./ts_t037_VjAGfZ
1.0K    ./ts_t037_wG2zYu
1.0K    ./ts_t037_X1aGjP
1.0K    ./ts_t037_xGIy5Q
1.0K    ./ts_t037_XkRbRv
1.0K    ./ts_t037_z4wMwz
1.0K    ./ts_t038_3PTDJw
1.0K    ./ts_t038_9wHcXc
1.0K    ./ts_t038_bB66Fc
1.0K    ./ts_t038_dd8r8I
1.0K    ./ts_t038_ddLxL5
1.0K    ./ts_t038_fJImSD
1.0K    ./ts_t038_i48hti
1.0K    ./ts_t038_j5AhJX
1.0K    ./ts_t038_JihTQQ
1.0K    ./ts_t038_q4j5Up
1.0K    ./ts_t038_Rj8ieI
1.0K    ./ts_t038_s9IAZw
1.0K    ./ts_t038_v6we9m
1.0K    ./ts_t038_wk3sN2
1.0K    ./ts_t038_YszGGi
1.0K    ./ts_t038_Zc4XpY
1.0K    ./ts_t039_1ELJ53
1.0K    ./ts_t039_4B810v
1.0K    ./ts_t039_7CWsEv
1.0K    ./ts_t039_7qKh2c
1.0K    ./ts_t039_8TSpWr
1.0K    ./ts_t039_b6came
1.0K    ./ts_t039_Dh0O8J
1.0K    ./ts_t039_E3wdu1
1.0K    ./ts_t039_faChXh
1.0K    ./ts_t039_H2qNU9
1.0K    ./ts_t039_hCknqZ
1.0K    ./ts_t039_hEkicd
1.0K    ./ts_t039_iJO5Bw
1.0K    ./ts_t039_IVw9BY
1.0K    ./ts_t039_jYQytU
1.0K    ./ts_t039_M9svFx
1.0K    ./ts_t039_qEEymd
1.0K    ./ts_t039_qRSHxj
1.0K    ./ts_t039_qvo7If
1.0K    ./ts_t039_rb2SCb
1.0K    ./ts_t039_rtFrCq
1.0K    ./ts_t039_s2PGZm
1.0K    ./ts_t039_SRGsr2
1.0K    ./ts_t039_u2TvsT
1.0K    ./ts_t039_u8hrwG
1.0K    ./ts_t039_vOn96z
1.0K    ./ts_t039_w9d7Ak
1.0K    ./ts_t039_woqX8r
1.0K    ./ts_t039_wvBoyP
1.0K    ./ts_t039_xgmrLr
1.0K    ./ts_t039_xo5IYM
1.0K    ./ts_t039_YL1p7z
1.0K    ./ts_t03b_3qRhWr
1.0K    ./ts_t03b_4prEXv
1.0K    ./ts_t03b_5FJkKH
1.0K    ./ts_t03b_6zEEMP
1.0K    ./ts_t03b_7Onqc5
1.0K    ./ts_t03b_B3WdJf
1.0K    ./ts_t03b_beY4Pf
1.0K    ./ts_t03b_bU4tdR
1.0K    ./ts_t03b_cyMoMV
1.0K    ./ts_t03b_dIaB6Z
1.0K    ./ts_t03b_iqoimb
1.0K    ./ts_t03b_jokfrg
1.0K    ./ts_t03b_LhcBq8
1.0K    ./ts_t03b_lkc1vz
1.0K    ./ts_t03b_moNAoF
1.0K    ./ts_t03b_nUGRCH
1.0K    ./ts_t03b_osl5Mj
1.0K    ./ts_t03b_pDQ1ZJ
1.0K    ./ts_t03b_Q2unn3
1.0K    ./ts_t03b_rDL7p5
1.0K    ./ts_t03b_RYmeEj
1.0K    ./ts_t03b_U0tf3o
1.0K    ./ts_t03b_ubBcwN
1.0K    ./ts_t03b_VrTMsj
1.0K    ./ts_t03b_VtryRb
1.0K    ./ts_t03b_xBtK1c
1.0K    ./ts_t03b_YrjkeX
1.0K    ./ts_t03b_zG3qh5
1.0K    ./ts_t03d_6nq8pZ
1.0K    ./ts_t03d_9vSwKc
1.0K    ./ts_t03d_dBG3uQ
1.0K    ./ts_t03d_i9eDA5
1.0K    ./ts_t03d_J0MC7J
1.0K    ./ts_t03d_mlcfhB
1.0K    ./ts_t03d_mYjrf4
1.0K    ./ts_t03d_stDcTa
1.0K    ./ts_t03d_tFsgtr
1.0K    ./ts_t03d_Tt5X9q
1.0K    ./ts_t03d_WnoHB0
1.0K    ./ts_t03d_ZF2Kyl
1.0K    ./ts_t03w_A5Yeek
1.0K    ./ts_t03w_EowQQB
1.0K    ./ts_t03w_FSAcYl
1.0K    ./ts_t03w_IGujAP
1.0K    ./ts_t03w_K21vm2
1.0K    ./ts_t03w_NmMUfh
1.0K    ./ts_t03w_o0ojw9
1.0K    ./ts_t03w_rlZrzb
1.0K    ./ts_t03w_wRSS5n
1.0K    ./ts_t03w_xT4GjF
@plesager
Owner

If there is a crash, you want to be able to look at the logs.
Corollary: if there is a crash, the user must clean up manually after having examined the logs.
Why not delete $SCRATCH/tmp_ecearth3 all at once? Nothing there is supposed to be kept.

It is possible that cleanup upon success is not perfect. I haven't checked, but it should be fixed if that's the case.
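
The "clean up on success, keep on crash" behaviour can be sketched with an EXIT trap, assuming bash. `run_with_tmpdir` and the `ece3pp_` prefix are hypothetical names, not existing ece3-postproc code; the wrapped command is assumed to take the temporary directory as its last argument.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_tmpdir is a hypothetical wrapper, not ece3-postproc code.

# Run a command in a subshell, passing a fresh temporary directory as its
# last argument. The directory is removed when the command succeeds and
# kept for inspection when it fails.
run_with_tmpdir() (
    workdir=$(mktemp -d "${TMPDIR:-/tmp}/ece3pp_XXXXXX") || exit 1

    cleanup() {
        status=$?   # exit status of whatever ended the subshell
        if [ "$status" -eq 0 ]; then
            rm -rf -- "$workdir"
        else
            echo "exit status $status: keeping $workdir for debugging" >&2
        fi
        exit "$status"
    }
    # The EXIT trap runs however the subshell ends: normal completion,
    # an explicit exit, or an abort under set -e.
    trap cleanup EXIT

    "$@" "$workdir"
)
```

The EXIT trap does not run if the process is killed with SIGKILL, so some directories could still be left behind in that case.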

Also note #39, which is (remotely) related.

@etiennesky
Contributor Author

Hi Philippe

I think the cleanup upon success is only missing one part, the removal of the directory.

About your corollary: users are lazy, so the files will usually be left behind.

Your suggestion to delete $SCRATCH/tmp_ecearth3 all at once is tempting, but you might end up deleting files while an ece3-postproc process is active.
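
One way to reduce that risk is to delete only directories that have not been touched for a while. The sketch below assumes bash and a `find` that supports `-mindepth`/`-maxdepth` (GNU and BSD both do); `purge_stale_tmpdirs` and the 7-day default are hypothetical choices. Note that a directory's modification time changes only when entries are created or removed inside it, so a long job that keeps writing to one existing file would still look stale.

```shell
#!/usr/bin/env bash
# Sketch only: purge_stale_tmpdirs is a hypothetical helper, not ece3-postproc code.

# Delete first-level directories under $1 whose modification time is older
# than about $2 days (default 7). Recently modified directories are kept.
purge_stale_tmpdirs() {
    local root="$1" days="${2:-7}"
    [ -d "$root" ] || return 0
    find "$root" -mindepth 1 -maxdepth 1 -type d -mtime +"$days" \
        -exec rm -rf -- {} +
}
```

It could be called as `purge_stale_tmpdirs "$SCRATCH/tmp_ecearth3" 7`, with the age threshold chosen to be longer than any single post-processing job.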

Not sure what the best approach is.

@etiennesky
Contributor Author

I think the best approach is to have a tmpdir that is unique to the experiment, so you can delete it when you are finished.

@plesager
Owner

Yes, probably the easiest way to go.
