BUG: output formatting with to_html(), index=False and/or index_names=False (#22579, #22747) #22655
Hello @simonjayhawkins! Thanks for updating the PR. Cheers ! There are no PEP8 issues in this Pull Request. 🍻 Comment last updated on December 28, 2018 at 15:37 Hours UTC |
Codecov Report
@@ Coverage Diff @@
## master #22655 +/- ##
==========================================
+ Coverage 92.17% 92.18% +<.01%
==========================================
Files 169 169
Lines 50708 50697 -11
==========================================
- Hits 46740 46734 -6
+ Misses 3968 3963 -5
Continue to review full report at Codecov.
|
Codecov Report
@@ Coverage Diff @@
## master #22655 +/- ##
==========================================
+ Coverage 92.29% 92.29% +<.01%
==========================================
Files 163 163
Lines 51948 51956 +8
==========================================
+ Hits 47945 47953 +8
Misses 4003 4003
Continue to review full report at Codecov.
|
pandas/io/formats/html.py
Outdated
# Determine if ANY column names need to be displayed
# since if the row index is not displayed a column of
# blank cells need to be included before the DataFrame values.
self.show_col_idx_names = all((self.fmt.has_column_names,
Sorry for the questions, but I'm still trying to wrap my head around the implementation. Based on the comment, why is this `all` here and not `any`? Wouldn't any of these require there to be a cell where a column index name would be placed?
With index=False, a single-level row index, and a multi-level column index where some but not all of the levels are named:
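import numpy as np
import pandas as pd

# Three-level column index; the middle level is deliberately left unnamed.
index = pd.MultiIndex.from_product([['a', 'b'], ['c', 'd'], ['e', 'f']],
                                   names=['foo', None, 'baz'])
df = pd.DataFrame(np.arange(64).reshape(8, 8), columns=index)
result = df.to_html(max_rows=4, max_cols=4, index=False)
print(result)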
foo | a | ... | b | ||
---|---|---|---|---|---|
c | ... | d | |||
baz | e | f | ... | e | f |
0 | 1 | 6 | 7 | ||
8 | 9 | 14 | 15 | ||
48 | 49 | 54 | 55 | ||
56 | 57 | 62 | 63 |
Note: the missing truncation indicators in the data are now fixed in master.
The misalignment of the column names is due to the logic being applied inside the level-generating loop:
pandas/pandas/io/formats/html.py
Lines 270 to 275 in d43ac97
name = self.columns.names[lnum]
row = [''] * (row_levels - 1) + ['' if name is None else
                                 pprint_thing(name)]
if row == [""] and self.fmt.index is False:
    row = []
Hence a class-level variable is needed to check whether ANY names need to be displayed, in order to determine alignment.
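The effect of making that decision per level can be sketched as follows. This is a minimal standalone illustration, not the pandas implementation: `leading_cells` is a hypothetical helper that mirrors the quoted lines, and `str` stands in for `pprint_thing`.

```python
def leading_cells(name, row_levels, show_index):
    """Build the leading header cells for one column level.

    Mirrors the per-level logic quoted above; `str` replaces
    `pprint_thing`, which is an assumption made for this sketch.
    """
    row = [''] * (row_levels - 1) + ['' if name is None else str(name)]
    if row == [''] and show_index is False:
        row = []
    return row


# Column level names from the example: the middle level is unnamed.
names = ['foo', None, 'baz']
cells = [leading_cells(name, row_levels=1, show_index=False) for name in names]
print(cells)  # [['foo'], [], ['baz']]
```

The unnamed level contributes no leading cell while the named levels contribute one each, so the header rows start at different columns.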
The ALL condition determines whether ANY names should be displayed given the to_html parameters, and uses similar logic to to_string etc.:
pandas/pandas/io/formats/format.py
Lines 796 to 803 in d43ac97
def _get_formatted_index(self, frame):
    # Note: this is only used by to_string() and to_latex(), not by
    # to_html().
    index = frame.index
    columns = frame.columns
    show_index_names = self.show_index_names and self.has_index_names
    show_col_names = (self.show_index_names and self.has_column_names)
and the rows in to_html:
pandas/pandas/io/formats/html.py
Lines 307 to 309 in d43ac97
if all((self.fmt.has_index_names,
        self.fmt.index,
        self.fmt.show_index_names)):
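The two checks answer different questions, which a small standalone sketch may make clearer. The function names here are hypothetical, and the `header` flag is an assumption added for illustration; only the has-names and show-index-names conditions appear in the snippets above.

```python
def has_any_name(names):
    """ANY check over levels: is at least one level name set?"""
    return any(name is not None for name in names)


def show_names(names, show_index_names=True, header=True):
    """ALL check over conditions: display names only if every one holds.

    `header` is an assumed extra flag, included only for illustration.
    """
    return all((has_any_name(names), show_index_names, header))


names = ['foo', None, 'baz']
print(has_any_name(names))                         # True
print(show_names(names))                           # True
print(show_names(names, show_index_names=False))   # False
print(show_names([None, None]))                    # False
```

One unnamed level does not suppress the names (the ANY part), but switching off any single display condition does (the ALL part).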
There is currently no test that explicitly covers this example, so I think the best way forward is to fully parametrize the truncation tests, in line with the parametrized basic_alignment tests, for added assurance.
I'll make show_col_idx_names a class property for clarity and add a note to refactor and 'inherit' from the DataFrameFormatter class. 'Inherit' is quoted since the HTMLFormatter class does not directly inherit from DataFrameFormatter. In the first refactor, just use mock inheritance like:
pandas/pandas/io/formats/html.py
Lines 46 to 48 in d43ac97
@property
def is_truncated(self):
    return self.fmt.is_truncated
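As a rough illustration of this delegation pattern, the sketch below uses two hypothetical classes, `Formatter` and `Renderer`, standing in for DataFrameFormatter and HTMLFormatter. It shows only the property forwarding, not the real class interfaces.

```python
class Formatter:
    """Hypothetical stand-in for the object that owns the formatting state."""

    def __init__(self, is_truncated, show_index_names):
        self.is_truncated = is_truncated
        self.show_index_names = show_index_names


class Renderer:
    """Hypothetical renderer that wraps a Formatter instead of subclassing it."""

    def __init__(self, fmt):
        self.fmt = fmt

    @property
    def is_truncated(self):
        # Forward to the wrapped formatter ("mock inheritance").
        return self.fmt.is_truncated

    @property
    def show_index_names(self):
        return self.fmt.show_index_names


fmt = Formatter(is_truncated=True, show_index_names=False)
renderer = Renderer(fmt)
print(renderer.is_truncated)      # True
print(renderer.show_index_names)  # False
```

Because the properties read through to the wrapped object on each access, the renderer never holds a stale copy of the formatter's state.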
@jreback @WillAyd with the additional parameterization of the truncation tests, we now have test coverage for multi-indexes with more than 2 rows, missing column index names and truncation with standard row indexes. There is now test coverage in place allowing the refactoring of |
I would make a sub-dir of data/html to hold all of this test data (and move the original .html files as well). |
@WillAyd over to you |
Thanks @simonjayhawkins ! |
thanks @simonjayhawkins ! |
* upstream/master:
  BUG: output formatting with to_html(), index=False and/or index_names=False (pandas-dev#22579, pandas-dev#22747) (pandas-dev#22655)
  MAINT: Port _timelex in codebase (pandas-dev#24520)
  Implement unique+array parts of 24024 (pandas-dev#24527)
  Integer NA docs (pandas-dev#23617)
git diff upstream/master -u -- "*.py" | flake8 --diff