---
title: "FAOSTAT-package 3.0"
subtitle: "An overhaul of the R wrapper for the FAOSTAT API"
author: "Sebastian Campbell"
date: "2024-03-18"
date-format: long
search: false
format:
revealjs:
theme: simple
logo: img/combined_logo.png
css: lib/css/logo.css
editor: source
---
```{r setup, include = FALSE}
library(FAOSTAT)
```
# The core mission
To update the `FAOSTAT` package to allow R users to consume FAO's API data easily
#
To update the `FAOSTAT` package to allow **R** users to consume FAO's API data easily
:::{.notes}
We have a bit to unpack here
:::
#
To update the `FAOSTAT` package to allow R users to consume **FAO**'s API data easily
#
To update the `FAOSTAT` package to allow R users to consume FAO's **API** data easily
# What _is_ FAO?
Food and Agriculture Organization
{width=60%}{width=40%}
# FAO
Goal is to:
> ...achieve food security for all and make sure that people have regular access to enough high-quality food to lead active, healthy lives.
## ESS in FAO
> Produce up-to-date statistics
> Develop and promote international food and agricultural statistical standards, methods and tools
> Work directly with countries to develop national statistical capacity
:::{.notes}
These have been paraphrased. So how can ESS disseminate statistics to the world of stakeholders?
:::
# What _is_ FAOSTAT?
- [FAOSTAT](https://www.fao.org/faostat) is the Food and Agriculture Organization's (FAO) tool for disseminating the statistical data it produces
- Not all data
    - Fisheries and Forestry are separate
## How do we get data from FAOSTAT?
- Web interface
- Data explorer
- CSV exporter
- **Bulk download**
- Web interface API
- FAOSTAT3 API
:::{.notes}
Bulk downloads were all that worked before we started
:::
## How do we get data from FAOSTAT?
- Web interface
- Data explorer
- CSV exporter
- Bulk download
- Web interface API
- ~~FAOSTAT3 API~~
:::{.notes}
This API doesn't work anymore
:::
# [The core problem](https://gitlab.com/paulrougieux/faostatpackage/-/issues/12)
{{< video vid/faostat_ping.mp4 >}}
# What _is_ an API?
- Application Programming Interface
- Allows different bits of software to communicate with each other
- Similar to how a steering wheel lets you drive anything
:::{.notes}
Slow down and explain this properly. When you drive, you don't need to think about injecting fuel or know anything about the differential. With an automatic, you don't even need to know about gears. And you can get in any vehicle, diesel, electric, hovercraft and the same ideas apply. This is the magic of abstraction and presenting an interface divorced from the actual functionality.
:::
# REST APIs
- HTTP-based
- Use mainly GET and POST
# REST example
**GET** - Request data from the server
:::{.incremental}
* Request: `GET https://example.com/johnsmith/info`
* Payload: None
* Response: `{"username": "johnsmith", "full_name": "John Smith"}`
:::
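
A GET request carries everything in its URL, so building one is mostly string assembly. A minimal base R sketch, assuming a hypothetical helper `build_get_url()` (not part of `FAOSTAT`) and the `example.com` address from above:

```r
# Hypothetical helper: assemble the URL of a GET request from its parts.
build_get_url <- function(base_url, path, query = list()) {
  # Drop any trailing slash from the base, then append the path segments
  url <- paste0(sub("/+$", "", base_url), "/", paste(path, collapse = "/"))
  if (length(query) > 0) {
    # Percent-encode each value so spaces and symbols survive transport
    values <- vapply(
      query,
      function(v) utils::URLencode(as.character(v), reserved = TRUE),
      character(1)
    )
    pairs <- paste0(names(query), "=", values)
    url <- paste0(url, "?", paste(pairs, collapse = "&"))
  }
  url
}

# No payload: the path alone identifies the resource
plain_url <- build_get_url("https://example.com", c("johnsmith", "info"))

# Optional query parameters (made up here) are appended after "?"
query_url <- build_get_url("https://example.com", c("johnsmith", "info"),
                           query = list(fields = "full name"))
```

Here `plain_url` matches the request shown above, while `query_url` shows how extra parameters would travel in the same request line.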
# What is the FAOSTAT package?
- API wrapper that allows R users to use FAO functions
- Allows users to pull in FAOSTAT data
## Why do R users need a package?
- Easily accessible documentation
- No need to convert JSON to R objects (tables)
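
To give a feel for the conversion a package saves you from, here is a rough base R sketch. It assumes the JSON has already been parsed into a list of records; the field names and values are made up for illustration and `to_table()` is a hypothetical helper:

```r
# Records as they might look after parsing a JSON array (values are made up)
records <- list(
  list(area = "Australia", year = 2019, value = 1.5),
  list(area = "Australia", year = 2020, value = 1.7)
)

# Hypothetical helper: turn each record into a one-row data frame, then stack
to_table <- function(records) {
  rows <- lapply(records, function(rec) {
    as.data.frame(rec, stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

tbl <- to_table(records)
tbl
```

Real responses tend to be messier (missing fields, nested values), which is part of why wrapping this step in a package helps.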
## Quick demo of pre-existing functionality
```{r, echo = TRUE, cache = TRUE}
landuse <- get_faostat_bulk("RL")
head(landuse)
```
## Custodianship
- Developed by FAO employees
- Currently maintained by Paul Rougieux at the European Commission
## Existing documentation
- A single JSON file
- A Word document
- Only covers a subset of functionality
## Challenges
- Old API doesn't work
- New API is largely undocumented
- Outdated functions
## We need to get from here
{.r-stretch}
## To here
{.r-stretch}
# FAOSTAT 2.3.0
- Completed in March 2023 with 3 goals:
    - Fix core functions
    - Triage existing functions
    - Describe all API endpoints
## [Triage existing functions](https://gitlab.com/paulrougieux/faostatpackage/-/issues/20)
{.r-stretch}
## Describe all API endpoints
- Used the JSON documentation
- Manually tested all the endpoints
- Wrote everything up on an issue page
## [It was a lot of work](https://gitlab.com/paulrougieux/faostatpackage/-/issues/23) {.scrollable}

## API structure
```{mermaid}
flowchart TD
Group --> Domain
Dimension --> Codes
Dimension --> Subdimensions
Subdimensions --> Codes
Domain --> Dimension
Domain --> Data
Domain --> BD[Bulk downloads]
Domain --> Metadata
Metadata --> Document
```
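
The same hierarchy can be written down as an edge list and queried. This is only a sketch of the diagram above in base R; `children()` and `descendants()` are hypothetical helpers, not functions from the package or the API:

```r
# Edge list copied from the flowchart: each row is one arrow
edges <- data.frame(
  from = c("Group", "Dimension", "Dimension", "Subdimensions",
           "Domain", "Domain", "Domain", "Domain", "Metadata"),
  to   = c("Domain", "Codes", "Subdimensions", "Codes",
           "Dimension", "Data", "Bulk downloads", "Metadata", "Document"),
  stringsAsFactors = FALSE
)

# Direct children of a node
children <- function(node) edges$to[edges$from == node]

# Everything reachable from a node, found by walking the arrows recursively
descendants <- function(node) {
  kids <- children(node)
  unique(c(kids, unlist(lapply(kids, descendants))))
}

domain_children <- children("Domain")
dimension_tree <- descendants("Dimension")
```

`domain_children` lists what hangs directly off a domain, and `dimension_tree` shows that codes can be reached both directly and through subdimensions.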
# FAOSTAT 3.0.0
Implementing all of the changes we identified in 2.3.0
- Deprecating old functions
- Creating new structures
## Now it works {.scrollable}
```{r, echo = TRUE}
read_fao(domain = "RL",
         area_codes = "8",
         element_codes = "5110",
         item_codes = "6620",
         year_codes = 2000:2020)
```
:::{.notes}
Compare this to how weak it was before
:::
## Defunct functions
```{r defunct, echo = TRUE, error = TRUE}
getWDI()
```
:::{.notes}
Backwards compatibility: you can still access the original help and code, but it's out of the way. Use `help("getWDI-defunct")` for other stuff
:::
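
For context, base R provides `.Defunct()` for retiring a function while leaving a pointer to its replacement. The sketch below uses a hypothetical `old_fetch()` and names `read_fao` as the replacement purely for illustration; it is not how the package itself is necessarily written:

```r
# Hypothetical function name; only .Defunct() itself is base R
old_fetch <- function(...) {
  # Signal an error that points users to a replacement and the help page
  .Defunct(new = "read_fao", package = "FAOSTAT")
}

# Calling it fails straight away; tryCatch() captures the condition
# so the message can be inspected without stopping the script
res <- tryCatch(old_fetch(), error = function(e) e)
conditionMessage(res)
```

The function stays in the namespace, so old scripts fail with a helpful message rather than a "could not find function" error.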
#
:::{.r-fit-text}
[Shiny app!](https://pascalian.shinyapps.io/data_quality/)
:::
:::{.notes}
This is based on the bulk data, because I want to show a lot of data, but it could easily be based on live
:::
# Future work
- Does it make sense to make a package for every API?
- Web-connectors are another approach
- Collecting AQUASTAT, Fisheries and other FAO data into one place would allow a single package to work for more data
# Further reading
- [The companion report to this presentation](https://sebastian-c.github.io/uls-data-project/data_project.html)