[Python][CI] test_download_tzdata_on_windows fails on Windows wheels due to CERTIFICATE_VERIFY_FAILED #45295

Closed
raulcd opened this issue Jan 17, 2025 · 4 comments

@raulcd
Member

raulcd commented Jan 17, 2025

Describe the bug, including details regarding any error messages, version, and platform.

Windows wheel builds started failing on January 15: the test fails to validate a certificate when downloading https://data.iana.org/time-zones/tzdata-latest.tar.gz

Part of the log below:

 _______________________ test_download_tzdata_on_windows _______________________
...

self = <ssl.SSLSocket [closed] fd=-1, family=2, type=1, proto=0>, block = False

    @_sslcopydoc
    def do_handshake(self, block=False):
        self._check_connected()
        timeout = self.gettimeout()
        try:
            if timeout == 0.0 and block:
                self.settimeout(None)
>           self._sslobj.do_handshake()
E           ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:1006)

Python311\Lib\ssl.py:1382: SSLCertVerificationError

During handling of the above exception, another exception occurred:

    @pytest.mark.skipif(sys.platform != "win32",
                        reason="Timezone database is already provided.")
    def test_download_tzdata_on_windows():
        tzdata_path = os.path.expandvars(r"%USERPROFILE%\Downloads\tzdata")
    
        # Download timezone database and remove data in case it already exists
        if (os.path.exists(tzdata_path)):
            shutil.rmtree(tzdata_path)
>       download_tzdata_on_windows()

Python311\Lib\site-packages\pyarrow\tests\test_util.py:223: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Python311\Lib\site-packages\pyarrow\util.py:248: in download_tzdata_on_windows
    with urlopen('https://data.iana.org/time-zones/tzdata-latest.tar.gz') as response:
Python311\Lib\urllib\request.py:216: in urlopen
    return opener.open(url, data, timeout)
Python311\Lib\urllib\request.py:519: in open
    response = self._open(req, data)
Python311\Lib\urllib\request.py:536: in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
Python311\Lib\urllib\request.py:496: in _call_chain
    result = func(*args)
Python311\Lib\urllib\request.py:1391: in https_open
    return self.do_open(http.client.HTTPSConnection, req,
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
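
The "unable to get local issuer certificate" message usually means that the trust store Python consulted did not contain the issuer of the server's certificate. A minimal sketch for inspecting that store on the failing machine follows; `describe_trust_store` is a hypothetical helper name, not part of PyArrow, and it only reports what the standard `ssl` module exposes.

```python
import ssl


def describe_trust_store():
    """Summarize where Python's default SSL context looks for CA certificates."""
    # Paths compiled into (or configured for) the OpenSSL build Python uses.
    paths = ssl.get_default_verify_paths()
    # The same kind of context urllib creates by default for HTTPS requests.
    context = ssl.create_default_context()
    return {
        "cafile": paths.cafile,
        "capath": paths.capath,
        "openssl_cafile": paths.openssl_cafile,
        # Number of CA certificates already loaded into the context.
        "loaded_ca_certs": context.cert_store_stats()["x509_ca"],
    }


if __name__ == "__main__":
    for key, value in describe_trust_store().items():
        print(f"{key}: {value}")
```

A very small `loaded_ca_certs` count on a CI runner would be consistent with the handshake failure above, although it does not prove the cause.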

Component(s)

Python, Continuous Integration

@kou kou changed the title [Python[CI] test_download_tzdata_on_windows fails on Windows wheels due to CERTIFICATE_VERIFY_FAILED [Python][CI] test_download_tzdata_on_windows fails on Windows wheels due to CERTIFICATE_VERIFY_FAILED Jan 18, 2025
@raulcd
Member Author

raulcd commented Jan 21, 2025

This might be solved upstream as the latest nightly wheels were successful. There might have been a real issue with the certificate for that URL.
I am closing it as won't fix. If it happens again we can re-open.

@amoeba
Member

amoeba commented Feb 5, 2025

After reopening this and putting up a PR, I realize the issue here is related to, but not the same as, what's being fixed in #45425. Should I open a new issue?

amoeba added a commit that referenced this issue Feb 11, 2025
…d use tzdata package for tzinfo database on Windows for ORC (#45425)

### Rationale for this change

We have two Windows issues and this PR is addressing both:

1. PyArrow's `download_tzdata_on_windows` can fail due to TLS issues in certain CI environments.
2. The Python wheel test infrastructure needs a tzinfo database for ORC and the automation fetching that started failing because the URL was made invalid upstream.

These two issues are being solved in one PR simply because they appeared together during the 19.0.1 release process, but they're separate issues.

### What changes are included in this PR?

1. Makes `download_tzdata_on_windows` more robust to TLS errors by attempting to use `requests` if it's available and falling back to urllib otherwise.
2. Switches our Windows wheel test infrastructure to grab a tzinfo database from the tzdata package on PyPI instead of from a mirror URL. This should be much more stable for us over time.
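
As a rough illustration of the two changes above (a sketch, not the code merged in #45425): the first part prefers `requests` when it is importable and falls back to urllib otherwise, and the second part locates the `zoneinfo` directory shipped inside the tzdata package. The helper names `pick_fetcher`, `fetch_with_requests`, `fetch_with_urllib`, and `tzdata_zoneinfo_dir` are hypothetical. That `requests` avoids the failure because it verifies against its bundled certifi CA list rather than the Windows store, and that ORC finds the database through the `TZDIR` environment variable, are both assumptions here.

```python
import importlib.util
import os
from urllib.request import urlopen

TZDATA_URL = "https://data.iana.org/time-zones/tzdata-latest.tar.gz"


def fetch_with_requests(url, timeout=30):
    """Download url with requests (optional dependency)."""
    import requests

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()
    return response.content


def fetch_with_urllib(url, timeout=30):
    """Download url with the standard library only."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def pick_fetcher():
    """Return (name, function), preferring requests when it is installed."""
    if importlib.util.find_spec("requests") is not None:
        return "requests", fetch_with_requests
    return "urllib", fetch_with_urllib


def tzdata_zoneinfo_dir():
    """Return the zoneinfo directory of the tzdata package, or None."""
    spec = importlib.util.find_spec("tzdata")
    if spec is None or not spec.submodule_search_locations:
        return None
    package_dir = list(spec.submodule_search_locations)[0]
    zoneinfo_dir = os.path.join(package_dir, "zoneinfo")
    return zoneinfo_dir if os.path.isdir(zoneinfo_dir) else None


if __name__ == "__main__":
    zoneinfo_dir = tzdata_zoneinfo_dir()
    if zoneinfo_dir is not None:
        # Assumption: ORC reads the tzinfo database location from TZDIR.
        os.environ["TZDIR"] = zoneinfo_dir
        print("Using tzdata package at", zoneinfo_dir)
    else:
        name, fetch = pick_fetcher()
        payload = fetch(TZDATA_URL)
        print(f"Downloaded {len(payload)} bytes via {name}")
```

Reading the database from an installed package removes the network dependency from the test setup entirely, which is presumably why the PR describes it as the more stable option.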

### Are these changes tested?

Yes.

### Are there any user-facing changes?

No.
* GitHub Issue: #45295

Lead-authored-by: Bryce Mecum <petridish@gmail.com>
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
Signed-off-by: Bryce Mecum <petridish@gmail.com>
@amoeba
Member

amoeba commented Feb 11, 2025

Issue resolved by pull request 45425
#45425

@amoeba amoeba added this to the 19.0.1 milestone Feb 11, 2025
@amoeba amoeba closed this as completed Feb 11, 2025
amoeba added a commit to amoeba/arrow that referenced this issue Feb 11, 2025
…ust and use tzdata package for tzinfo database on Windows for ORC (apache#45425)

@pitrou
Member

pitrou commented Feb 17, 2025

@github-actions crossbow submit wheel-windows-cp310-cp310-amd64
