pandas pd.read_excel() with sheet_name=None returns wrong result, and dtype does not allow nonstring dtypes #449
Comments
I reported the above with version 2020.9.7, and you put the label "fixed in next version" on it, but it wasn't fixed in 2020.9.8. In the meantime, all of the use cases below should work correctly once you get this right. The fix I listed above doesn't handle them all.
import pandas as pd
from typing import Dict
dfd: Dict[str, pd.DataFrame] = pd.read_excel("simple.xlsx", sheet_name=None)
print(dfd)
dfd2: pd.DataFrame = pd.read_excel("simple.xlsx", dtype=object)
print(dfd2)
print(dfd2.dtypes)
dfd3: pd.DataFrame = pd.read_excel("simple.xlsx", sheet_name="Sheet1")
print(dfd3)
dfd4: Dict[int, pd.DataFrame] = pd.read_excel("simple.xlsx", sheet_name=[0])
print(dfd4)
dfd5: Dict[str, pd.DataFrame] = pd.read_excel("simple.xlsx", sheet_name=["Sheet1"])
print(dfd5)
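The five calls above could be told apart by overloads keyed on the type of sheet_name. What follows is only a minimal sketch under stated assumptions: `read_excel_sketch` and the stand-in `DataFrame` class are hypothetical names invented for illustration, the single-sheet workbook is faked in memory, and none of this is the actual stub shipped by Pylance or pandas. It uses `List[int]` and `List[str]` for simplicity and does not cover lists mixing ints and strings.

```python
from typing import Any, Dict, List, Union, overload


class DataFrame:
    """Hypothetical stand-in for pd.DataFrame so the sketch runs without pandas."""


# A single sheet, selected by position or by name (or left at the default).
@overload
def read_excel_sketch(io: str, sheet_name: Union[int, str] = ..., dtype: Any = ...) -> DataFrame: ...
# sheet_name=None: every sheet, keyed by sheet name.
@overload
def read_excel_sketch(io: str, sheet_name: None, dtype: Any = ...) -> Dict[str, DataFrame]: ...
# A list of positions: keyed by the ints that were requested.
@overload
def read_excel_sketch(io: str, sheet_name: List[int], dtype: Any = ...) -> Dict[int, DataFrame]: ...
# A list of names: keyed by the strings that were requested.
@overload
def read_excel_sketch(io: str, sheet_name: List[str], dtype: Any = ...) -> Dict[str, DataFrame]: ...


def read_excel_sketch(io: str, sheet_name: Any = 0, dtype: Any = None) -> Any:
    # Assumption: pretend the workbook contains exactly one sheet, "Sheet1".
    sheets = ["Sheet1"]
    if sheet_name is None:
        return {name: DataFrame() for name in sheets}
    if isinstance(sheet_name, list):
        return {key: DataFrame() for key in sheet_name}
    return DataFrame()


# The five use cases, with dtype typed as Any so that dtype=object is accepted.
dfd = read_excel_sketch("simple.xlsx", sheet_name=None)
dfd2 = read_excel_sketch("simple.xlsx", dtype=object)
dfd3 = read_excel_sketch("simple.xlsx", sheet_name="Sheet1")
dfd4 = read_excel_sketch("simple.xlsx", sheet_name=[0])
dfd5 = read_excel_sketch("simple.xlsx", sheet_name=["Sheet1"])
```

Typing dtype as `Any` is the loosest possible choice; a real stub would likely want a narrower alias.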
This change didn't make it into last week's release on Wednesday. We also did a hot-fix release on Friday to fix a specific regression, but that was a very targeted fix. This issue will be addressed in this week's release (typically on Wednesdays). We will mark the issue as closed once it is included in a release. It will also appear in the release notes.
Just make sure that the 5 test cases I just listed are covered, which are a superset of what was originally reported.
All of those pass in "basic" in our current unreleased state except for the third. Maybe because the I'm not really sure how we were supposed to have fixed a superset of the reported issues, though, if by definition that means some issues were not included...
I've got a set of changes that work for everything except the fourth one, which is less likely to be used (IMHO). In trying to get it to work, I uncovered the other issues. Let me know how you want to manage. I think the definition of
List is a bad parameter type, because it requires that it explicitly be of the
Looking at the pandas source for
Thanks for checking; we can flip that back in read_excel.
That should be fixed in the next release; all five examples typecheck.
This issue has been fixed in version 2020.10.0, which we've just released. You can find the changelog here: https://github.com/microsoft/pylance-release/blob/master/CHANGELOG.md#2020100-7-october-2020
Environment data
Expected behaviour
No error
Actual behaviour
reports
Two issues here:
sheet_name=None: pandas will always return a Dict with the names of the sheets as the keys, which are strings. This is true even if there is only one sheet. The only time int keys are returned is when you ask for explicit sheet numbers (I think).
dtype: the parameter can be any valid type.
Possible fix: