glimpse errors on a tbl_df from jsonlite #103

Closed
jankatins opened this issue Jun 19, 2016 · 4 comments

@jankatins

jankatins commented Jun 19, 2016

library("dplyr")
library("tibble")
library("jsonlite")
download.file("http://jsonstudio.com/wp-content/uploads/2014/02/world_bank.zip", "world_bank.zip")
world_bank <- jsonlite::stream_in(unz("world_bank.zip", "world_bank.json"))
world_bank2 <- as_data_frame(world_bank)
world_bank2 %>% glimpse()

[tibble is compiled from master]

results in

Observations: 500
Variables: 50
Error in `[.data.frame`(X[[i]], ...): undefined columns selected
Traceback:

1. world_bank2 %>% glimpse()
2. withVisible(eval(quote(`_fseq`(`_lhs`)), env, env))
3. eval(quote(`_fseq`(`_lhs`)), env, env)
4. eval(expr, envir, enclos)
5. `_fseq`(`_lhs`)
6. freduce(value, `_function_list`)
7. withVisible(function_list[[k]](value))
8. function_list[[k]](value)
9. glimpse(.)
10. glimpse.tbl(.)
11. as.data.frame(head(x, rows))
12. head(x, rows)
13. head.data.frame(x, rows)
14. x[seq_len(n), , drop = FALSE]
15. `[.tbl_df`(x, seq_len(n), , drop = FALSE)
16. lapply(x, `[`, i)
17. FUN(X[[i]], ...)
18. `[.data.frame`(X[[i]], ...)
19. stop("undefined columns selected")

Interestingly, the original data.frame works with glimpse():

> world_bank %>% glimpse()
Observations: 500
Variables: 50
$ _id                      (data.frame) c("52b213b38594d8a2be17c780", "52b2...
$ approvalfy               (chr) "1999", "2015", "2014", "2014", "2014", "2...
$ board_approval_month     (chr) "November", "November", "November", "Octob...
$ boardapprovaldate        (chr) "2013-11-12T00:00:00Z", "2013-11-04T00:00:...
$ borrower                 (chr) "FEDERAL DEMOCRATIC REPUBLIC OF ETHIOPIA",...
$ closingdate              (chr) "2018-07-07T00:00:00Z", NA, NA, NA, "2019-...
$ country_namecode         (chr) "Federal Democratic Republic of Ethiopia!$...
$ countrycode              (chr) "ET", "TN", "TV", "RY", "LS", "KE", "IN", ...
$ countryname              (chr) "Federal Democratic Republic of Ethiopia",...
$ countryshortname         (chr) "Ethiopia", "Tunisia", "Tuvalu", "Yemen, R...
$ docty                    (chr) "Project Information Document,Indigenous P...
$ envassesmentcategorycode (chr) "C", "C", "B", "C", "B", "C", "A", "C", "B...
$ grantamt                 (int) 0, 4700000, 0, 1500000, 0, 0, 0, 27280000,...
$ ibrdcommamt              (int) 0, 0, 0, 0, 0, 0, 500000000, 0, 0, 2000000...
$ id                       (chr) "P129828", "P144674", "P145310", "P144665"...
$ idacommamt               (int) 130000000, 0, 6060000, 0, 13100000, 100000...
$ impagency                (chr) "MINISTRY OF EDUCATION", "MINISTRY OF FINA...
$ lendinginstr             (chr) "Investment Project Financing", "Specific ...
$ lendinginstrtype         (chr) "IN", "IN", "IN", "IN", "IN", "IN", "IN", ...
$ lendprojectcost          (dbl) 550000000, 5700000, 6060000, 1500000, 1500...
$ majorsector_percent      (list) Education, Education, Public Administrati...
$ mjsector_namecode        (list) Education, Education, Public Administrati...
$ mjtheme                  (list) Human development, Economic management, S...
$ mjtheme_namecode         (list) Human development, , 8, 11, Economic mana...
$ mjthemecode              (chr) "8,11", "1,6", "5,2,11,6", "7,7", "5,4", "...
$ prodline                 (chr) "PE", "RE", "PE", "RE", "PE", "PE", "PE", ...
$ prodlinetext             (chr) "IBRD/IDA", "Recipient Executed Activities...
$ productlinetype          (chr) "L", "L", "L", "L", "L", "L", "L", "L", "L...
$ project_abstract         (data.frame) c("The development objective of the...
$ project_name             (chr) "Ethiopia General Education Quality Improv...
$ projectdocs              (list) Project Information Document (PID),  Vol....
$ projectfinancialtype     (chr) "IDA", "OTHER", "IDA", "OTHER", "IDA", "ID...
$ projectstatusdisplay     (chr) "Active", "Active", "Active", "Active", "A...
$ regionname               (chr) "Africa", "Middle East and North Africa", ...
$ sector                   (list) Primary education, Secondary education, P...
$ sector1                  (data.frame) c("Primary education", "Public admi...
$ sector2                  (data.frame) c("Secondary education", "General p...
$ sector3                  (data.frame) c("Public administration- Other soc...
$ sector4                  (data.frame) c("Tertiary education", "NA", "NA",...
$ sector_namecode          (list) Primary education, Secondary education, P...
$ sectorcode               (chr) "ET,BS,ES,EP", "BZ,BS", "TI", "JB", "FH,YW...
$ source                   (chr) "IBRD", "IBRD", "IBRD", "IBRD", "IBRD", "I...
$ status                   (chr) "Active", "Active", "Active", "Active", "A...
$ supplementprojectflg     (chr) "N", "N", "Y", "N", "N", "Y", "N", "N", "N...
$ theme1                   (data.frame) c("Education for all", "Other econo...
$ theme_namecode           (list) Education for all, 65, Other economic man...
$ themecode                (chr) "65", "54,24", "52,81,25,47", "59,57", "41...
$ totalamt                 (int) 130000000, 0, 6060000, 0, 13100000, 100000...
$ totalcommamt             (int) 130000000, 4700000, 6060000, 1500000, 1310...
$ url                      (chr) "http://www.worldbank.org/projects/P129828...

Note that repr also has a problem with the produced tbl_df: IRkernel/repr#70

This might be related to tidyverse/dplyr#775: if I read that issue right, data.frame columns should be forbidden in a tbl_df, but this one contains some.

@krlmlr
Member

krlmlr commented Jun 20, 2016

as_data_frame() should probably be more concerned about the input it allows in the first place. For glimpse() the solution could be a safe fallback that at least doesn't throw or distort the output.

@jankatins
Author

I didn't find anything in ?data.frame saying that such columns are not allowed, only that the default data.frame(...) function converts each argument to a data.frame and uses each of its columns. Nothing there says that every column must be an atomic vector. If that is a problem, then I should probably file an issue at the jsonlite repo...
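For readers following along, a minimal base-R sketch of how such columns can be spotted before converting. The toy data frame and the names `df` and `is_df_col` are made up for illustration and are not part of the original report; only the column name `sector1` is borrowed from the glimpse() output above.

```r
# Toy stand-in for the World Bank data (hypothetical values).
df <- data.frame(id = c("P1", "P2"), stringsAsFactors = FALSE)

# Assigning a data frame with `$<-` stores it as a single column,
# which is the kind of nested column jsonlite produces.
df$sector1 <- data.frame(
  Name = c("Primary education", "Public administration"),
  Percent = c(46, 70),
  stringsAsFactors = FALSE
)

# Flag every column that is itself a data frame.
is_df_col <- vapply(df, is.data.frame, logical(1))

# Names of the nested columns.
names(df)[is_df_col]
```

Running the same `vapply()` check on `world_bank` should flag the columns that glimpse() reports as `(data.frame)`.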

@krlmlr krlmlr added the ready label Jun 20, 2016
@krlmlr
Member

krlmlr commented Jun 20, 2016

Printing world_bank2 doesn't work either. The reason is that head.data.frame(), which is called by both, fails.

Nested data frames are valid return values for jsonlite; you'll have to use jsonlite::flatten() to use them with tibbles. I'm adding a check to as_data_frame.data.frame(); the same check is already in place for as_data_frame.list().
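A short sketch of the suggested workaround, assuming jsonlite is installed. The toy data frame `nested` is hypothetical and only mimics the shape of `world_bank`; `jsonlite::flatten()` is the function named above.

```r
library(jsonlite)

# Hypothetical nested data frame: the `sector1` column is itself a data frame.
nested <- data.frame(id = c("P1", "P2"), stringsAsFactors = FALSE)
nested$sector1 <- data.frame(
  Name = c("Primary education", "Public administration"),
  Percent = c(46, 70),
  stringsAsFactors = FALSE
)

# flatten() expands each nested data frame column into ordinary columns,
# prefixing the new names with the name of the parent column.
flat <- jsonlite::flatten(nested)

# After flattening, no column should be a data frame any more.
any(vapply(flat, is.data.frame, logical(1)))

names(flat)
```

With the data from the report, the equivalent call would presumably be `as_data_frame(jsonlite::flatten(world_bank))` before passing the result to glimpse().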

@krlmlr krlmlr self-assigned this Jun 20, 2016
@krlmlr krlmlr added in progress and removed ready labels Jun 20, 2016
@krlmlr krlmlr closed this as completed in 5d45cf7 Jun 20, 2016
@github-actions
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 15, 2020