-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Simple find() without delimiters for recursive rm #280
Conversation
I would appreciate trials here, before bothering rerecording the tests |
Quick test regarding #279 : It manages to work way faster then it used to be (large recursive listing was very slow before). However it still fails to delete:
|
The "no such object" is mentioned in the description above. I wonder if we should silently ignore these, since the "root" of the delete tree may or may not exist as a key. fsspec/filesystem_spec#391 will solve the "gs://" thing |
Not sure how it supposed to work. Remove usually works unless the specified inputs don't exists. So I think as long as the input paths exists every other not found should be catched? |
Right, but "exists" is a little subtle a meaning here. If we find any paths, then the root path exists, but doesn't necessarily need to be deleted (because if it only an implied prefix, it disappears when all the contents are gone). |
What I mean: as long as the paths exists before deleting anything it's fine. Any "not found" during deleting is a success as the goal is successfull |
OK, that makes sense! So raise NotFound in the case that no paths were retrieved, ignore any no-such errors. I'll make that happen. |
That makes sense to me at least, in the case the app doesn't care if the path existed before they can always catch that initial not found. |
Will this also work for other recursive listings? I have some code which lists everything, currently with a direct library usage, but using fsspec would be great (but my test with 0.6.2 made it run very slow). |
It will affect things that call find. Note that the code here does not populate the directory cache, so if you then go to open/list/etc on the files, you will not have gained anything. |
In my case that's actually good. I use it currently in combination with pyspark, so the cache won't be up-to-date anyway |
@pvanderlinden , would you mind testing again? |
@martindurant this works good when passing in without protocol, expand_path will add the original path with protocol to the list though, which causes the error (line 737 in core.py). |
The expand_path issue is fixed in fsspec master |
@mariusvniekerk , do you think this should be ported to s3fs too? |
Sorry, I missed that. It works fine with master indeed. |
Yeah it would be handy in s3fs. |
Fixes #273
Fixes #279
(needs rerecording)
This method is efficient and works in general. In the case that you recursive-delete a directory (i.e., within a bucket) with content, but where a key of the same name as the directory does not exist, you will get an exception for that path (but process succeeds)