DOC: read_excel supports skiprows argument like read_csv, but tests and docs needed #36435

Dr-Irv · 2020-09-17T21:05:54Z

Location of the documentation

https://pandas.pydata.org/docs/reference/api/pandas.read_excel.html
https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html

Documentation problem

In read_csv, we say that we support:
skiprows list-like, int or callable, optional

Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file.

If callable, the callable function will be evaluated against the row indices, returning True
if the row should be skipped and False otherwise. An example of a valid callable argument
would be lambda x: x in [0, 2].
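
To make the three documented forms concrete, here is a minimal sketch using read_csv on an in-memory file. The sample data and the variable names (CSV_TEXT, df_int, df_list, df_callable) are made up for illustration and are not taken from the pandas docs.

```python
import io

import pandas as pd

# Hypothetical sample: two junk lines, then a header row and two data rows.
CSV_TEXT = "junk0\njunk1\na,b\n1,2\n3,4\n"

# int: skip that many lines at the start of the file.
df_int = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=2)

# list-like: skip the lines with these 0-indexed line numbers.
df_list = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=[0, 1])

# callable: called with each row index; the row is skipped when it returns True.
df_callable = pd.read_csv(io.StringIO(CSV_TEXT), skiprows=lambda x: x < 2)

print(df_int)
```

For this sample, all three calls should skip the same two lines and return the same frame with columns a and b.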

In read_excel, we say that we support:

skiprows list-like

Rows to skip at the beginning (0-indexed).

It turns out that the int and callable arguments work fine with read_excel(), so we should indicate that in the documentation.
We also need to add tests for those 2 cases.

Suggested fix for documentation and additional tests

- Copy the read_csv doc for skiprows over to read_excel.
- Add tests to tests/io/excel/test_readers.py for the int and callable options.
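
A rough sketch of what such a check could look like, for illustration only. It assumes openpyxl is installed and is passed explicitly as the engine. The helper name read_back and the sample sheet are hypothetical. The real tests in tests/io/excel/test_readers.py use the test suite's own fixtures and data files, so this is not how they are written.

```python
import io

import pandas as pd


def read_back(skiprows):
    """Write a small sheet to memory, then read it back with the given skiprows.

    Hypothetical helper for illustration; assumes the openpyxl engine is available.
    """
    # Two junk rows, then a header row and two data rows.
    raw = pd.DataFrame(
        [["junk", "junk"], ["junk", "junk"], ["a", "b"], [1, 2], [3, 4]]
    )
    buffer = io.BytesIO()
    with pd.ExcelWriter(buffer, engine="openpyxl") as writer:
        raw.to_excel(writer, index=False, header=False)
    buffer.seek(0)
    return pd.read_excel(buffer, engine="openpyxl", skiprows=skiprows)


expected = pd.DataFrame({"a": [1, 3], "b": [2, 4]})

# The int and callable forms are the two cases the issue asks to cover.
pd.testing.assert_frame_equal(read_back(2), expected)
pd.testing.assert_frame_equal(read_back(lambda x: x < 2), expected)
```

The list-like form, read_back([0, 1]), should give the same result and is the case the read_excel docs already described.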


ahgamut · 2020-09-17T22:00:11Z

I'd like to try this out. How do I assign the issue to myself?

ahgamut · 2020-09-17T22:00:17Z

take

ahgamut · 2020-09-18T02:48:06Z

@Dr-Irv I've submitted a PR (#36437) for this, and it has passed all CI checks except one: the travis-ci check is failing due to a stalled build (?).

conda env create -q --file=ci/deps/travis-37-cov.yaml

Collecting package metadata (repodata.json): ...working... done

Solving environment: ...working... 

No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.

Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#build-times-out-because-no-output-was-received

The build has been terminated

…6435) (pandas-dev#36437) * DOC: updated read_excel skiprows documentation to match read_csv (GH36435) * TST: updated read_excel test with skiprows as int, callable (GH36435)

Dr-Irv added the Docs, IO Excel (read_excel, to_excel), good first issue, and Needs Tests (unit test(s) needed to prevent regressions) labels Sep 17, 2020

Dr-Irv mentioned this issue Sep 17, 2020

pd.read_excel() has some parameters incorrectly documented microsoft/pylance-release#385

Closed

github-actions bot assigned ahgamut Sep 17, 2020

ahgamut mentioned this issue Sep 18, 2020

DOC: read_excel skiprows documentation matches read_csv (#36435) #36437

Merged

5 tasks

Dr-Irv closed this as completed in #36437 Sep 18, 2020
