UnsupportedOperation 'seek' when loading excel files from url #20434
Comments
This patch fixes the failing tests on my machine:
@@ -10,6 +10,7 @@ import os
 import abc
 import warnings
 import numpy as np
+from http.client import HTTPResponse
 from pandas.core.dtypes.common import (
     is_integer, is_float,
@@ -387,7 +388,9 @@ class ExcelFile(object):
             self.book = io
         elif not isinstance(io, xlrd.Book) and hasattr(io, "read"):
             # N.B. xlrd.Book has a read attribute too
-            if hasattr(io, 'seek'):
+            #
+            # http.client.HTTPResponse.seek() -> UnsupportedOperation exception
+            if not isinstance(io, HTTPResponse) and hasattr(io, 'seek'):
                 # GH 19779
                 io.seek(0)
Do you think it's sufficient for resolving this issue? I would like to provide my first pull request. Or is this issue more related to my setup? I'm completely new to pandas development and I've just prepared a working environment following the guide Contributing to pandas. Thanks,
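The patch special-cases HTTPResponse because hasattr(io, 'seek') is true even for streams that cannot seek. A more general check might be to ask the stream itself through seekable(). The sketch below is not part of the patch; it assumes the object follows the io.IOBase protocol, and NonSeekableStream and can_rewind are hypothetical names used so the example runs without network access.

```python
import io


class NonSeekableStream(io.RawIOBase):
    """Hypothetical stand-in for an HTTP response body: readable, not seekable."""

    def __init__(self, payload):
        super().__init__()
        self._buffer = io.BytesIO(payload)

    def readable(self):
        return True

    def readinto(self, b):
        # Delegate reads to the in-memory buffer.
        return self._buffer.readinto(b)


def can_rewind(stream):
    """Return True only if the stream reports that it supports seek()."""
    seekable = getattr(stream, "seekable", None)
    return callable(seekable) and seekable()


stream = NonSeekableStream(b"excel bytes")
print(hasattr(stream, "seek"))  # the attribute exists on every io.IOBase
print(can_rewind(stream))
print(can_rewind(io.BytesIO(b"excel bytes")))
```

Whether seekable() is reliable for every file-like object that read_excel accepts is an open question; objects that do not implement it are treated here as not rewindable.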
Closes pandas-dev#20434. Back in pandas-dev#19779 a call of a seek() method was added. This call fails on HTTPResponse instances with an UnsupportedOperation exception, so for this case a try..except wrapper was added here.
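The try..except approach described in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: rewind and Unseekable are hypothetical names, and the stand-in class relies on the default io.RawIOBase.seek(), which raises io.UnsupportedOperation.

```python
import io


def rewind(stream):
    """Try to move the stream back to its start; report whether that worked."""
    try:
        stream.seek(0)
    except io.UnsupportedOperation:
        # Some streams expose seek() but cannot actually perform it.
        return False
    return True


class Unseekable(io.RawIOBase):
    """Hypothetical stream whose inherited seek() raises UnsupportedOperation."""


buffer = io.BytesIO(b"abc")
buffer.read()  # move the position to the end
print(rewind(buffer), buffer.tell())
print(rewind(Unseekable()))
```

Compared with an isinstance check, this does not need to know the concrete stream type, at the cost of attempting the call first. Objects without any seek attribute would still need the existing hasattr guard.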
I can reproduce this issue, including the failure of the existing tests, on another Linux machine:
git clone https://github.com/mcrot/pandas.git pandas-mcrot
cd pandas-mcrot
git remote add upstream https://github.com/pandas-dev/pandas.git
conda update conda
conda env create -f ci/environment-dev.yaml
source activate pandas-dev
python setup.py build_ext --inplace -j 4
python -m pip install -e .
conda install -c defaults -c conda-forge --file=ci/requirements-optional-conda.txt
pytest pandas/tests/io/test_excel.py
Result:
Output of pd.show_versions():
INSTALLED VERSIONS
commit: 01882ba
pandas: 0.23.0.dev0+657.g01882ba
Is there anyone who can confirm this issue? Probably not, since there would be a lot of test failures because of this. Alternatively, does anyone have an idea what could be wrong with my setup, e.g. do you have a significant difference in the output for
I can confirm that I get a failure locally.
Code Sample, a copy-pastable example if possible
Problem description
In my version 0.23.0.dev0+657.g01882ba I get an UnsupportedOperation.
PR #19926 was made in order to fix #19779. It introduced a call to the seek() method, but only for objects that have that method. When a URL is given to pandas.read_excel(), seek() is called on an HTTPResponse instance, which does not seem to support seeking even though the seek() method is available.
This issue is already covered by a test. When running the tests, the test method TestXlrdReader.test_read_from_http_url fails for the same reason.
Expected Output
In version 0.22 this code returns the data as expected.
Output of pd.show_versions()
INSTALLED VERSIONS
commit: 01882ba
python: 3.6.4.final.0
python-bits: 64
OS: Linux
OS-release: 4.4.0-116-generic
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: de_DE.UTF-8
LOCALE: de_DE.UTF-8
pandas: 0.23.0.dev0+657.g01882ba
pytest: 3.4.2
pip: 9.0.1
setuptools: 38.5.1
Cython: 0.27.3
numpy: 1.14.2
scipy: 1.0.0
pyarrow: 0.8.0
xarray: 0.10.2
IPython: 6.2.1
sphinx: 1.7.1
patsy: 0.5.0
dateutil: 2.7.0
pytz: 2018.3
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.4
feather: 0.4.0
matplotlib: 2.2.2
openpyxl: 2.5.0
xlrd: 1.1.0
xlwt: 1.3.0
xlsxwriter: 1.0.2
lxml: 4.1.1
bs4: 4.6.0
html5lib: 1.0.1
sqlalchemy: 1.2.5
pymysql: 0.8.0
psycopg2: None
jinja2: 2.10
s3fs: 0.1.3
fastparquet: 0.1.4
pandas_gbq: None
pandas_datareader: None