-
Types and distinctions
- Formal documents: journal articles, books, book chapters, theses, consulting reports, etc.
- Informal documents: preliminary analyses, statistical homework, etc.
- Online content: web pages, blog posts, forum posts
- Browser metaphor versus page/slide-based metaphor
-
Context
- When to use reproducible analysis?
- When to use knitr with R Markdown or LaTeX?
-
Reproducibility varies on a continuum
-
One particular form:
- code transforms raw data and meta-data into processed data,
- code runs analyses on the data, and
- code incorporates analyses into a report
-
Ideally, the process involves a one-click build
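A minimal base R sketch of the three steps above, wired into a single `build()` call. The data, the function names (`process_data`, `run_analysis`, `write_report`, `build`) and the plain-text report are all hypothetical; a real project would read raw data from files and would usually produce the report with knitr.

```r
# Hypothetical raw data; a real project would read this from a file.
raw <- data.frame(id = 1:6,
                  group = c("a", "a", "b", "b", "c", NA),
                  score = c("10", "12", "9", "n/a", "15", "11"),
                  stringsAsFactors = FALSE)

# Step 1: transform raw data into processed data.
process_data <- function(raw) {
  out <- raw
  out$score <- suppressWarnings(as.numeric(out$score))  # "n/a" becomes NA
  out[complete.cases(out), ]                            # drop incomplete rows
}

# Step 2: run analyses on the processed data.
run_analysis <- function(dat) {
  aggregate(score ~ group, data = dat, FUN = mean)
}

# Step 3: incorporate the analyses into a report (plain text here).
write_report <- function(results, path) {
  lines <- c("Mean score by group",
             sprintf("%s: %.1f", as.character(results$group), results$score))
  writeLines(lines, path)
  invisible(path)
}

# One-click build: a single call runs every step in order.
build <- function(raw, path = tempfile(fileext = ".txt")) {
  write_report(run_analysis(process_data(raw)), path)
}

report_path <- build(raw)
readLines(report_path)
```

Keeping each step in its own function means a change to the raw data or to an analysis choice only requires re-running `build()`.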
-
Public sharing of document, code, and data is optional, but forms part of the gold standard of scientific openness
-
Goes by many names, particularly "reproducible research", but I prefer "reproducible analysis".
\tiny{ See also: \url{http://stats.stackexchange.com/a/15006/183} \url{https://github.com/jeromyanglim/rmarkdown-rmeetup-2012/issues/11}}
- Ability to reproduce analysis
- Increase accuracy
- Ability to verify analyses are consistent with intentions
- Ability to review analysis choices
- Increase clarity of communication
- Increased trustworthiness
- Increased accuracy plus ability for others to verify
- Extensibility
- Ability to easily modify or re-use existing analyses
- Combine R and plain text file format to produce documents (e.g., pdfs, HTML documents, etc.)
- Sweave
- brew
- knitr
\tiny{see also \url{http://cran.r-project.org/web/views/ReproducibleResearch.html}}
- R: http://www.r-project.org/
- RStudio: http://rstudio.org/
- In R:
install.packages("knitr")
install.packages("markdown")
install.packages("xtable")
install.packages("ggplot2")
install.packages("lattice")
- pandoc:
- LaTeX distribution:
- E.g., TeX Live, MiKTeX http://www.latex-project.org/ftp.html
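Before installing anything, it can help to check what is already present. The sketch below uses only base R; treating `pdflatex` as a sign that a LaTeX distribution is installed is an assumption, and the `install.packages()` call is left commented out so that nothing is installed by accident.

```r
# Packages named in the list above.
pkgs <- c("knitr", "markdown", "xtable", "ggplot2", "lattice")

# TRUE for each package that can be loaded from the current library paths.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
}

# Sys.which() returns "" when a program is not on the PATH.
# Assumption: pdflatex stands in for "a LaTeX distribution is installed".
tools <- c("pandoc", "pdflatex")
tools_found <- setNames(nzchar(Sys.which(tools)), tools)
print(tools_found)
```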
- Simple, readable, intuitive, light-weight markup
- Convert to HTML
- Raw HTML can be interspersed to add functionality
- Various extensions and flavours of markdown
- Popular on websites: e.g., StackOverflow, GitHub, Reddit
\tiny{see also: \url{http://daringfireball.net/projects/markdown/ }}
\includegraphics[width=4in]{figures/headings.png}
\includegraphics[width=4in]{figures/basic-formatting.png}
\includegraphics[width=4in]{figures/paragraphs.png}
\includegraphics[width=4in]{figures/dot-points.png}
\includegraphics[width=4in]{figures/equations.png}
- Uses MathJax to render LaTeX (and other) equations
- Inserts MathJax script reference into HTML header
\tiny{getting started: \url{http://jeromyanglim.blogspot.com.au/2010/10/getting-started-with-writing.html}}
\includegraphics[width=4in]{figures/links.png}
\includegraphics[width=4in]{figures/images.png}
\includegraphics[width=4in]{figures/code.png}
\includegraphics[width=4in]{figures/quote.png}
\includegraphics[width=4in]{figures/tables.png}
\includegraphics[width=4in]{figures/html.png}
- knitr: R Package developed by Yihui Xie for weaving R (and other languages) with various markup languages
- R Markdown: A file format that combines R code chunks and markdown text which is converted by knitr into markdown, and other formats (e.g., HTML, pdf, etc.).
- RStudio: Open source, cross-platform IDE for R.
- knitr supports many markups: LaTeX, Markdown, HTML, reStructuredText
- knitr has really nice defaults
- Tidy placement of generated files
- Simplified figure production
- automatically print ggplot2 and lattice figures
- print figures by default
- permit interspersing of figures and console output
- Greater extensibility:
- output options
- supports languages other than R
- Simplified caching
- And more: http://yihui.name/slides/2012-knitr-RStudio.html
-
Benefits of RStudio as IDE for R
- Open source
- Works on Linux, Mac, and Windows
- Many useful features
- It just works
- Tight integration with knitr
-
But many other options
- Emacs with ESS
- Vim with R plugin
- Eclipse with StatET
- etc.
- Introduction to R Markdown
- Statistics homework example
- Analysis of Winter Olympic Medals Example
\includegraphics[width=3in]{figures/rstudio-screenshot.png}
see http://yihui.name/knitr/options
```{r my_chunk_name, some_option='some_value'}
some_r_code
```
`r opts_chunk$set(opt = value)` # general form
`r opts_chunk$set(cache=TRUE)` # e.g., global cache
- Hide console input:
echo=FALSE
- Hide assorted messages:
warning=FALSE, error=FALSE, message=FALSE
- Hide console output:
results="hide"
- Display console input as is:
tidy=FALSE
- Output raw markup:
results="asis"
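To see what an option such as `echo=FALSE` does, a small R Markdown document can be knitted from a character vector. This is a sketch, assuming knitr is installed; the chunk names `shown` and `hidden_input` are made up, and the call is skipped when knitr is missing.

```r
# A two-chunk R Markdown document held in a character vector.
rmd <- c(
  "```{r shown}",
  "1 + 1",
  "```",
  "",
  "```{r hidden_input, echo=FALSE}",
  "2 + 3",
  "```"
)

md <- NULL
if (requireNamespace("knitr", quietly = TRUE)) {
  # knit(text = ...) returns the resulting markdown instead of writing a file.
  md <- knitr::knit(text = rmd, quiet = TRUE)
  cat(md, sep = "\n")
}
```

In the resulting markdown the first chunk shows both its code and its result, while the second shows only the result.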
Format       Inline code      With I()
R Markdown   `r 2 + 2`        `r I(2+2)`
Markdown     `4`              4
HTML         <code>4</code>   4
- Support for multiple figures in a code block
- also see e.g., par(mfrow=c(2,2)) or grid.arrange
- Figures and console output can be interspersed in a code chunk
- Various code chunk options
- see http://yihui.name/knitr/options
- fig.width and fig.height
- dev defaults to pdf for LaTeX and png for HTML/markdown
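Outside knitr, the same ideas can be tried directly on a graphics device. The sketch below draws a 2 x 2 grid with `par(mfrow = c(2, 2))` on a `pdf()` device whose `width` and `height` play the role of `fig.width` and `fig.height`; the plotted curves are arbitrary illustrations.

```r
out <- tempfile(fileext = ".pdf")
pdf(out, width = 7, height = 5)        # size in inches, like fig.width and fig.height
old_par <- par(mfrow = c(2, 2))        # 2 x 2 grid of panels on one page

x <- seq(0, 2 * pi, length.out = 100)
plot(x, sin(x), type = "l", main = "sin")
plot(x, cos(x), type = "l", main = "cos")
hist(sin(x), main = "Histogram of sin(x)")
boxplot(list(sin = sin(x), cos = cos(x)), main = "Boxplots")

par(old_par)                           # restore the previous layout
dev.off()                              # close the device so the file is written
file.exists(out)
```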
- Many options for creating HTML tables:
- R packages: xtable, googleVis, R2HTML, hwriter
- markdown extensions: github, pandoc
- Custom R code
- Among the R packages, xtable is a reasonable option
- For informal reports just use console output
- css can be added later to control table appearance
- If you require sophisticated tables, you may want to switch to LaTeX
library(xtable)  # provides xtable() and its print method
print(xtable(my_data_frame, caption = "My Caption", digits = 3),
      type = "html",
      caption.placement = "top",
      html.table.attributes = "style=\"border: 1px solid black;\"")
\centerline{\includegraphics[height=1.5in]{figures/simple_table.png}}
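The "custom R code" route listed above can be quite small. The function below is a hypothetical base R helper (`html_table` is not part of any package) that turns a data frame into HTML table markup; in an R Markdown chunk it would be printed with `cat()` under `results="asis"`.

```r
# Hypothetical helper: convert a data frame to a character vector of HTML lines.
html_table <- function(df, digits = 3) {
  # Escape characters that have special meaning in HTML.
  esc <- function(x) {
    x <- gsub("&", "&amp;", x, fixed = TRUE)
    x <- gsub("<", "&lt;", x, fixed = TRUE)
    gsub(">", "&gt;", x, fixed = TRUE)
  }
  # Format each column: round numbers, escape everything else.
  cols <- lapply(df, function(x) {
    if (is.numeric(x)) as.character(round(x, digits)) else esc(as.character(x))
  })
  header <- paste0("<tr>",
                   paste0("<th>", esc(names(df)), "</th>", collapse = ""),
                   "</tr>")
  # Wrap each cell, then paste the columns together row by row.
  cells <- lapply(unname(cols), function(x) paste0("<td>", x, "</td>"))
  rows <- paste0("<tr>", do.call(paste0, cells), "</tr>")
  c("<table>", header, rows, "</table>")
}

example_df <- data.frame(name = c("a", "b"), x = c(1.23456, 2))
cat(html_table(example_df), sep = "\n")
```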
Basic workflow:
- If knitting is quick, don't cache.
- If knitting takes more than ten seconds, add `r opts_chunk$set(cache=TRUE)` to the top of the R Markdown file.
- If caching is causing problems, delete the contents of the cache folder.
- But if caching is causing problems and knitting takes a long time, name R code chunks and use the dependson option in knitr (see http://yihui.name/knitr/options). Naming also permits selective deletion of named R code chunks in the cache directory.
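Selective deletion can be scripted. The helper below is a hypothetical sketch (`clear_chunk_cache` is not a knitr function) and assumes that cache files are named `<chunk label>_<32 hex digit hash>.<extension>`; check the contents of your own cache folder before relying on it. The demonstration uses fake files in a temporary directory.

```r
# Hypothetical helper: delete cached files that belong to one named chunk.
clear_chunk_cache <- function(chunk, cache_dir = "cache") {
  files <- list.files(cache_dir, full.names = TRUE)
  # Strip the assumed "_<32 hex digits>.<extension>" suffix to recover the label.
  labels <- sub("_[0-9a-f]{32}\\.[A-Za-z]+$", "", basename(files))
  hit <- files[labels == chunk]
  unlink(hit)
  invisible(basename(hit))
}

# Demonstration with fake cache files in a temporary directory.
demo_dir <- file.path(tempdir(), "cache_demo")
dir.create(demo_dir, showWarnings = FALSE)
hash <- strrep("a", 32)
fake <- paste0(c("load_data_", "load_data_", "fit_model_"), hash,
               c(".RData", ".rdb", ".RData"))
invisible(file.create(file.path(demo_dir, fake)))

removed <- clear_chunk_cache("load_data", demo_dir)
list.files(demo_dir)   # only the fit_model file should remain
```

Comparing the whole label, not just a prefix, keeps a chunk called `load` from wiping the cache of `load_data`.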
- Maintained by Jeffrey Horner; developed by JJ Allaire, Jeffrey Horner, Vicent Marti, and Natacha Porte
- R Package that creates more options for converting Markdown to HTML
markdownToHTML("file.md", "file.html", options=c(...))
- The default options are
"hard_wrap", "use_xhtml", "smartypants", "base64_images"
require(knitr) # for knitting from rmd to md
require(markdown) # for md to html
knit('test.rmd', 'test.md') # creates md
markdownToHTML('test.md', 'test.html') # create html
browseURL(paste('file://',
file.path(getwd(),'test.html'),
sep='')) # open file in browser
See ?markdownHTMLOptions for more options. E.g.,
markdownToHTML('test.md', 'test.html',
options='fragment_only')
- pandoc is a library and command-line tool for converting between many document formats (e.g., HTML, markdown, pdf, LaTeX, docx; also supports multiple plain text slide formats such as beamer)
- Lots of options
- Often requires thought in order to minimise conversion issues
pandoc -s file.html -o file.pdf
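The same command can be driven from R, which is convenient inside a build script. This is a sketch using base R's `system2()`; `pandoc_args` and `run_pandoc` are hypothetical helper names, and only the `-s` and `-o` flags from the command above are used.

```r
# Build the argument vector for: pandoc -s <input> -o <output>
pandoc_args <- function(input, output, standalone = TRUE) {
  c(if (standalone) "-s", input, "-o", output)
}

# Run pandoc, failing early with a clear message if it is not installed.
run_pandoc <- function(input, output, standalone = TRUE) {
  if (!nzchar(Sys.which("pandoc"))) {
    stop("pandoc was not found on the PATH")
  }
  status <- system2("pandoc", shQuote(pandoc_args(input, output, standalone)))
  if (status != 0) stop("pandoc exited with status ", status)
  invisible(output)
}

pandoc_args("file.html", "file.pdf")
# run_pandoc("file.html", "file.pdf")  # needs pandoc and, for PDF output, a LaTeX engine
```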
- For simple documents, click `knit to HTML` in RStudio
- For complex documents use a command-line option:
- e.g., makefile, Rscript, etc.
- combine with pandoc, knitr options, markdown options, text manipulation tools (e.g., sed, awk, scripting languages) etc. to flexibly produce a variety of documents
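A command-line build with Rscript might look like the sketch below. The file name `build.R` and the helpers `output_names` and `build_report` are hypothetical; the two conversion calls are the `knit()` and `markdownToHTML()` steps already used in this talk, and they require the knitr and markdown packages.

```r
# build.R (hypothetical file name); run with: Rscript build.R report.Rmd

# Derive the output file names from the input file name.
output_names <- function(input) {
  stem <- tools::file_path_sans_ext(input)
  c(md = paste0(stem, ".md"), html = paste0(stem, ".html"))
}

# Knit the R Markdown file, then convert the markdown to HTML.
build_report <- function(input) {
  out <- output_names(input)
  knitr::knit(input, out[["md"]])                       # Rmd -> md
  markdown::markdownToHTML(out[["md"]], out[["html"]])  # md -> HTML
  invisible(out)
}

# Only build when called with exactly one existing .Rmd file.
args <- commandArgs(trailingOnly = TRUE)
if (length(args) == 1 && grepl("\\.[Rr]md$", args[1]) && file.exists(args[1])) {
  build_report(args[1])
}
```

A makefile rule can then call `Rscript build.R report.Rmd`, with pandoc or text manipulation steps added as later rules.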
If time permits, show example of knitr with LaTeX
-
knitr and R Markdown
- It makes reproducible analysis as simple as one click
- Great tool for:
- quick analyses for self and colleagues
- doing homework
- creating teaching resources
- blog posts, websites, etc.
- Scope to make more complex documents, but at a certain point it may be worth exploring other tools
-
knitr and R LaTeX
- Great for journal articles, theses, books (e.g., citations, cross-references, printed works, equations)
-
As your needs get more complex
- pandoc, makefiles, knitr options, markdown package options, scripts, etc.
- knitr: http://yihui.name/knitr/
- RStudio: http://rstudio.org/
- R Markdown with RStudio: http://rstudio.org/docs/authoring/using_markdown
- My Posts http://jeromyanglim.blogspot.com.au/search/label/reproducible%20research
- R on StackOverflow: http://stackoverflow.com/questions/tagged/r
- LaTeX: http://tex.stackexchange.com/
- knitr: https://github.com/yihui/knitr/issues
\begin{center} \LARGE{Questions?} \end{center}