VSIZIP and VSISTDIN are not compatible #751
There is no way that /vsistdin/ and /vsizip/ can be combined. Reading from a zip requires reading its end first, whereas reading from /vsistdin/ must be sequential from the beginning. So that "works" as expected.
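To illustrate why a zip has to be read from its end: the ZIP format stores an End of Central Directory record (signature bytes 'P' 'K' 0x05 0x06, at least 22 bytes long) at the tail of the archive, and a reader locates it by scanning backwards. The following is only a minimal sketch of that scan over an in-memory buffer; the function name is hypothetical and this is not GDAL's actual /vsizip/ code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of GDAL.
// Returns the offset of the End of Central Directory record, or -1 if the
// buffer does not contain one. The whole archive must already be available,
// which is exactly what a forward-only stream cannot offer.
inline long long findEndOfCentralDirectory(const std::vector<std::uint8_t>& data)
{
    const std::size_t kMinRecordSize = 22;  // record size without a trailing comment
    if (data.size() < kMinRecordSize)
        return -1;
    // Scan backwards, starting at the last position where the record could fit.
    for (std::size_t i = data.size() - kMinRecordSize + 1; i-- > 0;)
    {
        assert(i + 3 < data.size());
        if (data[i] == 'P' && data[i + 1] == 'K' &&
            data[i + 2] == 0x05 && data[i + 3] == 0x06)
            return static_cast<long long>(i);
    }
    return -1;
}
```

Because the scan starts at the end of the data, it cannot begin until the last byte of stdin has arrived.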
I'm running into this as well.
Would it not be possible to read stdin into a buffer, and then you can read to the end of that buffer? I guess if I'm going to flippantly suggest that something might or might not be possible, I should probably take a quick peek at the code and see whether it looks imposing. :)
...and they work by implementing subclasses of VSIFilesystemHandler and VSIVirtualHandle:
class VSIStdinFilesystemHandler final : public VSIFilesystemHandler
class VSIStdinHandle final : public VSIVirtualHandle
class VSIMemFilesystemHandler final : public VSIFilesystemHandler
class VSIMemHandle final : public VSIVirtualHandle
...and the
...so okay, it sounds like the first MB of stdin is indeed read into a buffer.
// We buffer the first 1MB of standard input to enable drivers
// to autodetect data. In the first MB, backward and forward seeking
// is allowed, after only forward seeking will work.
// TODO(schwehr): Make BUFFER_SIZE a static const.
#define BUFFER_SIZE (1024 * 1024)
static GByte* pabyBuffer = nullptr;
static GUInt32 nBufferLen = 0;
static GUIntBig nRealPos = 0;
/// ...and later, in VSIStdinInit:
pabyBuffer = static_cast<GByte *>(CPLMalloc(BUFFER_SIZE));
And for completeness' sake, the source for /vsizip/ is here: ...and I see that
So in order to allow reading ZIP files from stdin, I guess one of the following would be necessary:
I suppose there is no great cry for those at the moment, and so no plan for them. Hmmmm.
Well, one could potentially modify the code to replace the hard-coded 1MB buffer with something configurable: a VSISTDIN_BUFFER_LIMIT configuration option that would default to 1MB and could be set to another value, or to -1 to indicate no limit. Or perhaps accept /vsistdin?buffer_limit=.... (this would require looking for uses of "/vsistdin/" in the code base, particularly in drivers, to make it more flexible)
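As a rough illustration of the path-based variant, here is a sketch of how a limit could be pulled out of a "/vsistdin?buffer_limit=N" path, with 1MB as the default and -1 meaning no limit. The helper name and the fallback behaviour for malformed values are assumptions made for this example; nothing here is taken from the GDAL code base.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>

constexpr std::int64_t kDefaultBufferLimit = 1024 * 1024;  // the current hard-coded 1MB
constexpr std::int64_t kNoLimit = -1;                       // keep everything

// Hypothetical helper: extracts N from "/vsistdin?buffer_limit=N".
// A plain "/vsistdin/" or an unparsable value falls back to the default
// (that fallback is an assumption of this sketch).
inline std::int64_t parseBufferLimit(const std::string& path)
{
    const std::string key = "?buffer_limit=";
    const std::size_t pos = path.find(key);
    if (pos == std::string::npos)
        return kDefaultBufferLimit;
    assert(pos + key.size() <= path.size());
    const std::string value = path.substr(pos + key.size());
    try
    {
        std::size_t consumed = 0;
        const long long parsed = std::stoll(value, &consumed);
        // Reject trailing garbage and negative values other than -1.
        if (consumed != value.size() || parsed < kNoLimit)
            return kDefaultBufferLimit;
        return parsed;
    }
    catch (const std::exception&)
    {
        return kDefaultBufferLimit;  // empty, non-numeric, or out of range
    }
}
```

A configuration option such as VSISTDIN_BUFFER_LIMIT could feed the same default, so that the path syntax only overrides it.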
Wow, I didn't think the VSI syntax was so flexible. I guess it can be whatever you make it, though. :)
I feel like specifying a higher, but still finite, limit isn't a feature which people will generally need. Hmmm, but I guess that doesn't actually make sense, because of the following scenario:
...so I guess you would need two separate modes: the current "buffer 1MB but then throw away data after you've read it" mode, and a separate "keep a dynamically resized buffer containing all data ever read" mode. So then yes, I think your suggestion makes a lot of sense.
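The two modes can be sketched with a single reader over a forward-only stream: it retains the first `limit` bytes (or every byte when the limit is -1) and refuses to seek back into data it has already discarded. This is a simplified illustration under those assumptions, with hypothetical class and function names; it is not how VSIStdinHandle is actually written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch: a seekable view over a forward-only stream.
class RetainingReader
{
public:
    // limit = number of leading bytes to retain, or -1 to retain everything.
    RetainingReader(std::istream& in, std::int64_t limit) : in_(in), limit_(limit) {}

    // Seeks to an absolute offset. Fails if the target was already consumed
    // and is not retained, or if the stream ends before reaching it.
    bool seek(std::uint64_t offset)
    {
        if (offset < buffer_.size()) { pos_ = offset; return true; }
        if (offset < realPos_) return false;  // discarded data
        pos_ = realPos_;                      // skip forward from the stream head
        std::uint8_t scratch;
        while (pos_ < offset)
            if (read(&scratch, 1) != 1) return false;
        return true;
    }

    // Reads up to n bytes at the current position; returns the count read.
    std::size_t read(std::uint8_t* out, std::size_t n)
    {
        std::size_t done = 0;
        while (done < n)
        {
            assert(pos_ <= realPos_);
            if (pos_ < buffer_.size()) { out[done++] = buffer_[pos_++]; continue; }
            if (pos_ != realPos_) break;      // gap between buffer and stream head
            char c;
            if (!in_.get(c)) break;           // end of input
            const bool retain = limit_ < 0 ||
                buffer_.size() < static_cast<std::uint64_t>(limit_);
            if (retain) buffer_.push_back(static_cast<std::uint8_t>(c));
            ++realPos_;
            ++pos_;
            out[done++] = static_cast<std::uint8_t>(c);
        }
        return done;
    }

private:
    std::istream& in_;
    std::int64_t limit_;
    std::vector<std::uint8_t> buffer_;  // retained prefix of the stream
    std::uint64_t realPos_ = 0;         // bytes consumed from the stream so far
    std::uint64_t pos_ = 0;             // logical read position
};

// Mimics a ZIP reader: consume the input to its end, then seek back to
// `target` and read one byte. Returns that byte, or -1 if the seek fails.
inline int byteAfterRewind(const std::string& data, std::int64_t limit,
                           std::uint64_t target)
{
    std::istringstream in(data);
    RetainingReader reader(in, limit);
    std::uint8_t byte;
    while (reader.read(&byte, 1) == 1) {}
    if (!reader.seek(target)) return -1;
    return reader.read(&byte, 1) == 1 ? byte : -1;
}
```

With a finite limit, `byteAfterRewind` fails for targets past the retained prefix, which is the situation a ZIP reader hits on archives larger than the buffer; with a limit of -1 the rewind succeeds at the cost of holding the whole input in memory.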
/vsistdin/: make size of buffered area configurable (fixes #751)
Thanks @rouault!
The code for /VSISTDIN/ only kicks in if the entire path provided matches.
This means that if you try and do something like
It does not work the way it would if we pulled the file down with /vsicurl/ or similar. The documentation on chaining ( http://www.gdal.org/gdal_virtual_file_systems.html#gdal_virtual_file_systems_chaining ) suggests that it should, although the section on /vsistdin/ ( http://www.gdal.org/gdal_virtual_file_systems.html#gdal_virtual_file_systems_vsistdin ) contradicts it.
The example is easy enough to work around by using the standard file path. However, we are trying to use GDAL with Apache Nifi, which passes files to the stdin of sub-processes. This means we need to write the file to disk first, which can be a bit tricky in clustered environments.
Expected behavior and actual behavior.
Something like that prints out the info for the correct image in the zip file.
Instead we get:
Operating system
Tested on Windows 10 and Ubuntu 16.04
GDAL version and provenance
PS C:\data> gdalinfo --version
GDAL 2.3.0, released 2018/05/04
Code checked against Master ( 448138d)