---
output: github_document
editor_options:
  markdown:
    wrap: 72
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
```{r, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-",
  out.width = "100%"
)
# Read the package version from DESCRIPTION for use in the badges below
description <- read.dcf("DESCRIPTION")
version <- as.vector(description[, "Version"])
```
# datscience R-Package
<!-- badges: start -->
<center>
[R-CMD-check](https://github.com/Buedenbender/datscience/actions)
[Codecov test coverage](https://app.codecov.io/gh/Buedenbender/datscience?branch=main)
[Project Status: Active](https://www.repostatus.org/#active)
[License: MIT](https://choosealicense.com/licenses/mit/)
[Package version: `r version`](commits/main)
[Commit history](commits/main)
</center>
<!-- badges: end -->
The datscience (**dat**a analysis and **science**) R package contains functions
that are frequently needed when preparing data for publication.
Among them are:
* `format_flextable()` applies the APA 7th theme to a flextable object
* `apa_factorLoadings()` creates an APA 7th style factor loadings table
* `flex_table1()` creates a tabular comparison of two or more groups (usually Table 1)
* `Rcitation_appendix()` creates a full APA-style table with all citations and
information on the R packages used.
The overall goal is to improve the data analysis workflow and to
help with formatting challenges I encountered while preparing submissions
to scientific journals, for example getting statistics from R into MS Word in the right format.
### Installation
You can install the latest released version of datscience easily, directly from
[GitHub](https://github.com/Buedenbender/datscience#readme) with:
``` r
# Normal installation
install.packages("devtools")
devtools::install_github("Buedenbender/datscience")
```
My recommendation is to use the [pacman](https://github.com/trinker/pacman)
package manager instead, as it installs the latest version from GitHub and loads it directly.
```{r}
# Recommendation: pacman
pacman::p_load_gh("Buedenbender/datscience")
```
#### Installation Troubleshooting
Some users might encounter the error
`(System Error 267 @win/processx.c:1040)`, which is caused by special characters
(e.g., `ö` or `é`) in the username and therefore in the directory paths. In this case,
you can try to install `datscience` with the remotes package in standalone mode
in a fresh R session with no packages loaded (see below and, for reference, this [Stack Overflow
post](https://stackoverflow.com/q/68400661/7318488)).
``` r
Sys.setenv(R_REMOTES_STANDALONE="true")
remotes::install_github("Buedenbender/datscience")
```
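To check up front whether this error is likely, one can look for non-ASCII
characters in the paths R works with. The following is only a minimal base R
sketch: `has_non_ascii()` is a hypothetical helper that is not part of
datscience, and the choice of paths to inspect is an assumption.

``` r
# Hypothetical helper (not part of datscience): TRUE for every string
# that contains at least one non-ASCII character such as "ö" or "é".
has_non_ascii <- function(paths) {
  vapply(
    paths,
    function(p) !is.na(p) && any(utf8ToInt(enc2utf8(p)) > 127L),
    logical(1),
    USE.NAMES = FALSE
  )
}

# Paths that commonly contain the user name (assumed to be the relevant ones)
candidate_paths <- c(path.expand("~"), .libPaths(), tempdir())

# Show only the paths that contain special characters
candidate_paths[has_non_ascii(candidate_paths)]
```

If this returns an empty character vector, the paths checked here are plain
ASCII and the error probably has a different cause.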
### A Teaser of `datscience` Functionality: `flex_table1()`
While R offers the power to conduct almost any analysis one
can imagine, I often found it difficult to transfer the
analysis or its results from an R session into MS Word.
**Example Problem:** Get a *nicely* formatted (in accordance with APA 7th
publication manual) sociodemographic table 1 directly into a Word file
(\*.docx).
As of March 2022 the new function `datscience::flex_table1()` makes the creation
of the sociodemographic Table 1 (including the statistical comparisons of subsamples)
a piece of cake. Take a look at the new article `vignette("flex_table1")`.
For the example we took the popular `iris` dataset and added a simulated
categorical variable called *Color* that contains either "Blue" or "Orange".
We supply the function with a formula that determines which variables
are included in the table. For the example we include the
two metric variables *Sepal.Length* and *Sepal.Width* as well as the simulated
*Color*.
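```{r include=FALSE}
# Formula: variables to describe on the left, grouping variable after "|"
str_formula <- "~ Sepal.Length + Sepal.Width + Color | Species"
# Copy iris and add a simulated factor with 30 "Orange" and 120 "Blue"
# values in random order
iris_sim <- iris
iris_sim$Color <- factor(sample(c(rep("Orange", 30), rep("Blue", 120))))
```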
```{r}
# To load datscience
pacman::p_load_gh("Buedenbender/datscience")
str_formula <- "~ Sepal.Length + Sepal.Width + Color | Species"
flex_table1(str_formula, data = iris_sim, overall = "Overall") # |>
# save_flextable("Table1.docx")
```
<center><img src = "man/figures/flex_table1.png"
centerImage width = "80%" /> </center>
Uncomment the pipe operator ` |> ` above and the line after the call to `flex_table1()`
to directly save this nicely formatted tabular comparison as a Word (.docx) document.
### Further Examples of `datscience` Functionality
Below are just a few examples of the functionality of the package.
#### The `apa_corrTable()` Function
The `datscience::apa_corrTable()` function displays correlations with significance
markers and additionally adds descriptive statistics to the table, *see below*:
**Screenshot of “CorrelationTable_iris.docx”**
<center><img src = "man/figures/README-apa_corrTableExample.png"
centerImage width = "95%" /> </center>
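To make the idea of "correlations with marked significance" concrete, here is
a minimal base R sketch. It is not how `apa_corrTable()` or `corstars()` are
implemented; `star_label()` and `cor_with_stars()` are hypothetical helpers,
and the thresholds (.05, .01, .001) are the conventional ones, assumed here.

``` r
# Hypothetical helper: conventional significance stars for a p-value
star_label <- function(p) {
  if (p < 0.001) "***" else if (p < 0.01) "**" else if (p < 0.05) "*" else ""
}

# Hypothetical helper: lower-triangle Pearson correlations with stars
cor_with_stars <- function(data) {
  vars <- names(data)
  out <- matrix("",
    nrow = length(vars), ncol = length(vars),
    dimnames = list(vars, vars)
  )
  for (i in seq_along(vars)) {
    for (j in seq_len(i - 1)) {
      test <- stats::cor.test(data[[i]], data[[j]])
      r <- formatC(unname(test$estimate), format = "f", digits = 2)
      out[i, j] <- paste0(r, star_label(test$p.value))
    }
  }
  as.data.frame(out)
}

# The four numeric columns of iris
cor_with_stars(iris[1:4])
```

Only the lower triangle is filled, which mirrors the usual layout of a
published correlation table.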
This function revolves around three other useful functions from this
package.
1. Creates the correlation table by calling
`datscience::corstars()`[<sup>[1]</sup>](https://rdrr.io/github/DominikVogel/vogelR/src/R/output.R).
```{r}
datscience::corstars(iris[1:4])
```
2. Appends the desired summary statistics to the flextable.
3. Formats the `flextable::flextable()` object in APA 7th style
by utilizing the `format_flextable()` function. To illustrate the function,
we use it here to display the first 5 rows of the iris data set.
```{r}
datscience::format_flextable(
  flextable::flextable(head(iris, 5)),
  table_caption = c("Table 2", "Illustrating Functionality of format_flextable()")
)
```
<center><img src = "man/figures/README-format_flextableIris.png"
centerImage width = "65%" /> </center>
4. Saves the result with the `datscience::save_flextable()` function. This safely writes the flextable object to a Word (.docx) file, i.e., it avoids overwriting existing files by serializing the file names.
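The idea of avoiding overwrites by serializing file names can be sketched in
base R as follows. This is an illustration under assumptions, not the actual
implementation of `save_flextable()`: `next_free_path()` is a hypothetical
helper, and the `_001`, `_002`, ... suffix scheme is invented for the example.

``` r
# Hypothetical helper (not datscience's implementation): return `path`
# if it is unused, otherwise the first free "name_001.ext", "name_002.ext", ...
next_free_path <- function(path) {
  if (!file.exists(path)) {
    return(path)
  }
  stem <- tools::file_path_sans_ext(path)
  ext <- tools::file_ext(path)
  suffix <- if (nzchar(ext)) paste0(".", ext) else ""
  i <- 1L
  repeat {
    candidate <- sprintf("%s_%03d%s", stem, i, suffix)
    if (!file.exists(candidate)) {
      return(candidate)
    }
    i <- i + 1L
  }
}

# Demonstration in a fresh temporary directory
demo_dir <- tempfile("save_demo")
dir.create(demo_dir)
target <- file.path(demo_dir, "Table1.docx")
file.create(target)               # pretend the file already exists
basename(next_free_path(target))  # a serialized name instead of "Table1.docx"
```

A helper like this would be called right before writing, so that an existing
`Table1.docx` is left untouched.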
#### The `format_flextable()` Function
One of the most frequently used functions in the package is
`datscience::format_flextable()`, which takes a `flextable` object and applies
the APA 7th edition theme to it. It also provides a workaround for adding an
APA-ready table caption and a note.
**Note:** The formatting code (theme) of the `format_flextable()` function was inspired
by a blog post by [Rémi
Thériault](https://rempsyc.remi-theriault.com/articles/table).
The flextable package is very versatile, and it was exactly what I was looking for
to get nicely formatted tables directly from R(studio) into Word.
The same flexibility carries over to the `datscience::format_flextable()` function
from the datscience package. It just applies some repetitive formatting necessary
to convert a flextable into a "publication ready" APA-formatted table.
One example of this flexibility is printing the factor loadings
from a principal component analysis (PCA, `psych::principal`).
**Let's first create an exemplary PCA**<sup>[[1](https://www.rdocumentation.org/packages/datasets/versions/3.6.2/topics/Harman74.cor),
[2](https://personality-project.org/r/psych/help/principal.html)]</sup>
**and extract the factor loadings**:
(This is also packaged more conveniently in the `apa_factorLoadings()` function.)
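```{r}
# Creation of an example principal component analysis
pacman::p_load(psych, dplyr)
pc <- principal(Harman74.cor$cov, 4, rotate = "varimax")

# Sort the items by their loadings and convert them to a data frame
sorted_loadings <- pc$loadings |>
  fa.sort() |>
  round(3) |>
  unclass() |>
  as.data.frame()
items <- rownames(sorted_loadings)

pc_loadings <- sorted_loadings |>
  # Blank out small loadings (absolute value below .30)
  mutate(across(
    everything(),
    ~ if_else(abs(.x) < 0.3, "", as.character(.x))
  )) |>
  bind_cols(
    # fa.sort() reorders the items, so index these statistics by item name
    Communality = pc$communality[items],
    Uniqueness = pc$uniquenesses[items],
    Complexity = pc$complexity[items]
  ) |>
  mutate(across(where(is.numeric), \(x) round(x, 2))) |>
  tibble::rownames_to_column("items")
```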
**Formatting these loadings to APA with just one function:**
```{r}
formatted_loadings <- datscience::format_flextable(
  flextable::flextable(pc_loadings),
  table_caption = c("Table 3", "Factor Loadings in Exemplary PCA")
)
formatted_loadings
```
<center><img src = "man/figures/README-format_flextableExample.png"
centerImage width = "75%" /> </center>
### Related Work
[rempsyc](https://rempsyc.remi-theriault.com/): I highly recommend this package, which is now available. It offers great utility for psychologists and social scientists getting into R and provides concise tutorials on how to use its functions ([one](https://remi-theriault.com/blog_table.html) of which was the origin of `datscience::format_flextable()`).
<!-- # Testing Packagedown -->