Add section headings and row-by-row construction example #1416

oxinabox · 2018-06-04T06:07:42Z

I think it is easier to navigate with section headings.

And the row-by-row construction is something I always forget how to do.
(and I think more useful than column by column)

nalimilan · 2018-06-04T20:20:53Z

I'm not sure we should recommend creating data frames row by row, as it's really inefficient. At least we should explain how to create it by columns first.

Can you also fix the typography (code in backticks, spaces in "data frame", consistency in case, etc.)?

oxinabox · 2018-06-05T03:01:55Z

> I'm not sure we should recommend creating data frames row by row, as it's really inefficient. At least we should explain how to create it by columns first.

Here is what I am thinking:
3 ways to construct a DataFrame:

- All at once
- Row by Row
- Column by Column

Now let me detail what (I think) the use case for each is:

All at once

This is basically only for when you are loading data generated elsewhere.
The actual constructor (all at once) just turns data in one form into a dataframe.
It is basically intended to be supplied its args by loading packages like CSV.jl etc.

Row by Row

To me, row by row is the only way anyone ever would create a dataframe live.
One row represents the results of one simulation.

While appending a row is slow, its cost is vanishingly small compared to the simulation time.
If a simulation takes 10 minutes to run, and appending a row takes n_rows^2 microseconds,
it just doesn't matter, because by the time I have enough rows for the append time to reach a meaningful fraction of the simulation time...

The possible exception to this being the only way,
is "Predeclared Row by Row", where one initially constructs the whole dataframe but with all the values missing,
then fills in the blanks by running simulations.
But since there is no example of that I'll leave it out of here.
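
A minimal sketch of the two row-wise patterns described above, assuming a recent DataFrames.jl (1.x). `run_simulation` is a hypothetical stand-in for an expensive simulation, and the column names are made up for illustration; this is not the example that went into the documentation.

```julia
using DataFrames

# Hypothetical stand-in for a long-running simulation (assumption, not from the PR).
run_simulation(seed) = (seed = seed, score = seed^2 / 10)

# Row by row: start from empty typed columns and push! one row per run.
results = DataFrame(seed = Int[], score = Float64[])
for seed in 1:3
    push!(results, run_simulation(seed))  # NamedTuple fields are matched to columns by name
end

# "Predeclared row by row": allocate every row up front with missing values,
# then fill in the blanks as each simulation finishes.
predeclared = DataFrame(seed = 1:3,
                        score = Vector{Union{Missing, Float64}}(missing, 3))
for (i, seed) in enumerate(predeclared.seed)
    predeclared.score[i] = run_simulation(seed).score
end
```

The first form needs no knowledge of how many runs there will be; the second avoids growing the columns but requires the number of rows to be known in advance.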

Column by Column

This isn't actually used to construct a dataframe.
It is used to enrich a dataframe, based on the information already in it.
For example:

students[:zscore] = (students[:score] .- mean(students[:score])) ./ std(students[:score])
students[:honorslist] = students[:zscore] .> 2

Maybe a possible case where it is used to actually construct a dataframe is if you are pulling columns out of a database, based on an existing df column (or Vector) of keys.
Though that is not too different from my example of standardizing scores.
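
The snippet above uses the `df[:col]` indexing that was current in 2018. A runnable sketch of the same enrichment, assuming a recent DataFrames.jl (1.x) where columns are accessed as `df.col`, might look like this; the `students` data is invented for illustration.

```julia
using DataFrames
using Statistics  # mean and std

# Made-up data for illustration only.
students = DataFrame(name = ["Ann", "Bob", "Cy"], score = [60.0, 70.0, 80.0])

# Enrich the existing data frame with columns derived from what is already in it.
students.zscore = (students.score .- mean(students.score)) ./ std(students.score)
students.honorslist = students.zscore .> 2
```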

nalimilan · 2018-06-05T12:46:36Z

I still disagree. The presentation logically starts by presenting the main `DataFrame` constructor, which takes a series of columns. And in real life it does happen that one assembles a few vectors into a `DataFrame`. On the contrary, constructing a data frame row by row requires using `push!`, so it should come after, with a mention regarding poor performance (which can matter since the process generating rows does not always take 10 minutes).
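
The ordering argued for here (columns first, then `push!`) can be sketched as follows, assuming a recent DataFrames.jl (1.x); the vectors and column names are illustrative only.

```julia
using DataFrames

# Assemble a few existing vectors into a data frame with the main constructor.
labels = ["a", "b", "c"]
scores = [1.0, 2.0, 3.0]
df = DataFrame(label = labels, score = scores)

# Rows can then be appended with push!; a tuple is matched to columns by position.
# Each call grows every column by one element, so this is best kept out of hot loops.
push!(df, ("d", 4.0))
```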

oxinabox · 2018-06-05T12:55:17Z

Alright, I've said my piece and failed to convince you, so fair enough.
Changed.

nalimilan · 2018-06-05T13:06:26Z

Thanks. Can you also fix the spacing and syntax?

oxinabox · 2018-06-16T05:33:45Z

I think this is all good now?

nalimilan · 2018-06-16T13:22:05Z

Sorry, there are still lots of typos and inconsistencies in casing and in the way `DataFrame` is written.

oxinabox · 2018-06-17T02:33:44Z

I'll give it another check over.
Writing without typos is something I am really bad at.
(There are actual reasons for that but not relevant here)

oxinabox · 2018-06-28T02:21:52Z

bump

oxinabox · 2018-07-02T08:19:56Z

What do I need to do?

nalimilan · 2018-07-02T08:33:26Z

There are still typos, weird uses of semicolons, inconsistent casing in headings and missing blank lines after headings. Plus lines should be under 92 characters.

oxinabox · 2018-07-02T09:07:08Z

> There are still typos,

Hopefully I've got them all now.

> weird uses of semicolons,

~~I have not added any semicolons at all AFAICT. Maybe you mean colons? I do tend to overuse those.~~ (idk how I missed them) Fixed.

> inconsistent casing in headings

Ok, I've made every word in the headings start with an upper-case letter.

> and missing blank lines after headings.

Fixed.

> Plus lines should be under 92 characters.

I was under the impression that that convention is not being followed for this file.
There are 24 lines on master that are 93+ characters long.
In this PR there are 27 lines.

I feel like line-breaking the whole file can be its own PR.

oxinabox · 2018-07-02T12:27:20Z

Thanks.
I think this is one of those "It is easier to fix than to explain how to fix" situations.

Add section headings and row-by-row construction

5791d5e

selling

00dc851

columns first

1e4932c

space around operators

858f04b

typos fix

7184a40

Update getting_started.md

375fbdb

oxinabox and others added 2 commits July 2, 2018 17:30

Remove semicolons

a3e30d6

Fixes

6251914

More fixes

4b82abd

nalimilan approved these changes Jul 2, 2018

Merge branch 'master' into patch-2

d7ed385

nalimilan merged commit e731982 into JuliaData:master Jul 11, 2018

pdeffebach pushed a commit to pdeffebach/DataFrames.jl that referenced this pull request Jul 14, 2018

Add section headings and row-by-row construction example (JuliaData#1416)

12d9835

oxinabox mentioned this pull request Oct 1, 2020

[BREAKING] deprecate DataFrame constructors #2464

Merged
