You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
The zipfile.ZipInfo.from_file method takes a native filename, and an optional name to be placed in the archive. However, it uses a converted form of the native filename to the ZipInfo() constructor without turning it into the correct form for an archive name.
A native filename is equivalent to a name that can be put into an archive. This is clearly not true on all systems.
The archive name supplied will conform to the filesystem normalisation rules for the system it is running on. Again, this will not be true on filesystems that do not honour the same normalisation rules.
I believe the code should explicitly convert the filename to a form which is suitable for use as an archive name, rather than assuming that the name given is suitable.
A real world example on a system which uses a different scheme for filename separators is RISC OS, which has an os.sep of .. In such a case you might make a call to place a file in the archive with something like:
The expectation would be that this would create a ZipInfo object with an arcname of Application/Directory/Filename.txt. However, this will actually use the arcname Application.Directory.Filename/txt - which isn't at all what was intended or useful.
I believe the code would be better handled as something like this...
# Create ZipInfo instance to store file informationifarcnameisNone:
# The filename is still in native filename format, so we must convert to# the form used by the archive.arcname=os.path.normpath(os.path.splitdrive(filename)[1])
# Normalise the separator to os.sepifos.altsep:
arcname=arcname.replace(os.altsep, os.sep)
# Ensure that the directory and extension separators are in arcname formatifos.sep!='/'oros.extsep!='.':
parts=arcname.split(os.sep)
ifos.extsep!='.':
parts= [part.replace(os.extsep, '.') forpartinparts]
arcname='/'.join(parts)
whilearcnameandarcname[0] =='/':
arcname=arcname[1:]
This would...
Only perform the normalisation through the filesystem functions when a filename is given. That is, if you supply an arcname that's what you get, unaffected by the native platform's filesystem semantics.
Converts the filesystem semantics (from os.sep, os.altsep and os.extsep) to arcname semantics. We normalise from altsep to sep to simplify things, and then construct a name which uses the correct arcname convention from the known parts of the name.
Only performs the conversion if they would not be identity operations, so we're efficient on systems where the name is a pass through.
Retain the stripping of leading / characters in the archive, so that we don't create them in an odd place (although it's arguable this should be in the ZipInfo constructor to prevent overwriting system files, etc).
Ensure that ZipInfo is always called with a name which is an arcname, as that's what it expects anyhow (it has a little allowance for names in native form but this only ever worked on Windows and should not be relied on because on other systems it can never work, see ZipInfo filename is mangled when os.sep is not '/' #94529)
Retain as much of the existing behaviour as was actually reliable and deterministic as possible.
Your environment
CPython versions tested on: Python 3.9, Python 3.10
Operating system and architecture: OSX, RISC OS
The text was updated successfully, but these errors were encountered:
Bug report
The
zipfile.ZipInfo.from_file
method takes a native filename, and an optional name to be placed in the archive. However, it uses a converted form of the native filename to theZipInfo()
constructor without turning it into the correct form for an archive name.The code in question is:
cpython/Lib/zipfile.py
Lines 526 to 541 in 7db1d2e
I believe that code is assuming that:
ZipInfo()
will know that a native filename can be converted to an archive name (this is only true on POSIX and Windows systems, see ZipInfo filename is mangled when os.sep is not '/' #94529), and so is a bad assumptionI believe the code should explicitly convert the filename to a form which is suitable for use as an archive name, rather than assuming that the name given is suitable.
A real world example on a system which uses a different scheme for filename separators is RISC OS, which has an
os.sep
of.
. In such a case you might make a call to place a file in the archive with something like:The expectation would be that this would create a ZipInfo object with an arcname of
Application/Directory/Filename.txt
. However, this will actually use the arcnameApplication.Directory.Filename/txt
- which isn't at all what was intended or useful.I believe the code would be better handled as something like this...
This would...
os.sep
,os.altsep
andos.extsep
) to arcname semantics. We normalise fromaltsep
tosep
to simplify things, and then construct a name which uses the correct arcname convention from the known parts of the name./
characters in the archive, so that we don't create them in an odd place (although it's arguable this should be in the ZipInfo constructor to prevent overwriting system files, etc).Your environment
The text was updated successfully, but these errors were encountered: