Commit c313f84

adding lessons for main part of course
1 parent 1e73ec2 commit c313f84

80 files changed

+10639
-141
lines changed

_assignments_hold/assignment_10.Rmd

+41
@@ -0,0 +1,41 @@
+---
+layout: lesson
+title: Assignment 10
+subtitle: EDH7916
+author: Benjamin Skinner
+order: 10
+category: problemset
+links:
+pdf: assignment_10.pdf
+output:
+md_document:
+variant: gfm
+preserve_yaml: true
+---
+
+I have been opinionated throughout this course (and in lesson 10 in
+particular) about the best ways to organize a quantitative data
+workflow. Considering all of that, please answer the following two
+questions in a Markdown (`*.md`) or RMarkdown (`*.Rmd`) file, giving
+about half of a page to each answer:
+
+1. What organizational/work flow practice that I _have_ discussed do
+you think is unnecessary or impractical for daily data analytic
+tasks? Why? Keep in mind that practice doesn't have to include
+using R, but could instead mean using SPSS, Excel, _etc_. Also,
+it's not a trick or gotcha question! I want your (well considered)
+thoughts.
+1. What organizational/work flow practice have I _not_ included that
+you think would help reduce error or improve reproducibility? Why?
+
+#### Submission details
+
+- Save your script (`<lastname>_assignment_10.md` or
+`<lastname>_assignment_10.Rmd`) in your `scripts` directory.
+- Push changes to your repo (the new script and new folder) to GitHub
+prior to the next class session.
+
+
+
+
+

_assignments_hold/assignment_4.Rmd

+59
@@ -0,0 +1,59 @@
+---
+layout: lesson
+title: Assignment 4
+subtitle: EDH7916
+author: Benjamin Skinner
+order: 4
+category: problemset
+links:
+pdf: assignment_4.pdf
+output:
+md_document:
+variant: gfm
+preserve_yaml: true
+---
+
+Using the `hsls_small.csv` data set and the online codebook, answer
+the following questions. You **do not** need to save the final output
+as a data file: just having the final result print to the console is
+fine. For each question, I would like you to try to pipe all the
+commands together. Throughout, you **should** account for missing values by
+dropping them.
+
+For each question, show your data work and, if necessary, answer the
+question in a short (1-2 sentence(s)) comment.
+
+## Questions
+
+1. Compute the average test score by region and join back into the
+full data frame. Next, compute the difference between each
+student's test score and that of the region. Finally, return the
+mean of these differences by region.
+1. Compute the average test score by region and family income
+level. Join back to the full data frame. **HINT** You can join on
+more than one key.
+1. Select the following variables from the full data set:
+- `stu_id`
+- `x1stuedexpct`
+- `x1paredexpct`
+- `x4evratndclg`
+
+From this reduced data frame, reshape the data frame so that it is
+long in educational expectations, meaning that each observation
+should have two rows, one for each educational expectation type.
+
+_e.g. (your column names and values may be different)_
+
+| stu_id | expect\_type | expectation | x4evratndclg |
+|:------:|:------------:|:-----------:|:------------:|
+| 0001 | x1stuedexpct | 6 | 1 |
+| 0001 | x1paredexpct | 7 | 1 |
+| 0002 | x1stuedexpct | 5 | 1 |
+| 0002 | x1paredexpct | 5 | 1 |
+
+#### Submission details
+
+- Save your script (`<lastname>_assignment_4.R`) in your `scripts`
+directory.
+- Push changes to your repo (the new script and new folder) to GitHub
+prior to the next class session.

_assignments_hold/assignment_5.md

-53
This file was deleted.

_assignments_hold/assignment_9.Rmd

+34-22
@@ -4,7 +4,7 @@ title: Assignment 9
 subtitle: EDH7916
 author: Benjamin Skinner
 order: 9
-category: problemset
+category: supplemental
 links:
 pdf: assignment_9.pdf
 output:
@@ -13,29 +13,41 @@ output:
 preserve_yaml: true
 ---
 
-I have been opinionated throughout this course (and in lesson 10 in
-particular) about the best ways to organize a quantitative data
-workflow. Considering all of that, please answer the following two
-questions in a Markdown (`*.md`) or RMarkdown (`*.Rmd`) file, giving
-about half of a page to each answer:
-
-1. What organizational/work flow practice that I _have_ discussed do
-you think is unnecessary or impractical for daily data analytic
-tasks? Why? Keep in mind that practice doesn't have to include
-using R, but could instead mean using SPSS, Excel, _etc_. Also,
-it's not a trick or gotcha question! I want your (well considered)
-thoughts.
-1. What organizational/work flow practice have I _not_ included that
-you think would help reduce error or improve reproducibility? Why?
-
+Using the `hsls_small.csv` data set and the online code book, answer
+the following questions. You **do not** need to save the final output
+as a data file: just having the final result print to the console is
+fine. For each question, **you must answer using base R commands (no
+tidyverse)**. You can account for missing values by dropping them.
+
+For each question, show your data work and then answer the question in
+a short (1-2 sentence(s)) comment. (**NOTE:** If you also completed
+assignment 3, your written answers can be similar to what you wrote before.)
+
+## Questions
+
+1. What is the average standardized math test score?
+1. What is the average standardized math test score by gender?
+1. In what year and month were the oldest students in the data set
+born? The youngest?
+1. Among those students who are under 185% of the federal poverty line
+in the base year of the survey, what is the median household income
+(give the category and what that category represents).
+1. Of the students who earned a high school credential (diploma or
+GED), what percentage earned a GED or equivalency? How does this
+differ by region?
+1. What percentage of students ever attended a postsecondary
+institution by February 2016? Give the cross tabulations for:
+- family incomes less than or equal to $35,000 and greater than
+$35,000
+- region
+
+This means you should have percentages for 8 groups: above/below
+$35k within each region.
+
 #### Submission details
 
-- Save your script (`<lastname>_assignment_10.md` or
-`<lastname>_assignment_10.Rmd`) in your `scripts` directory.
+- Save your script (`<lastname>_assignment_9.R`) in your `scripts`
+directory.
 - Push changes to your repo (the new script and new folder) to GitHub
 prior to the next class session.
 
-
-
-
-

_assignments_hold/supplemental_assignment_2.Rmd

-53
This file was deleted.
File renamed without changes.
