
replace iotools with vroom **WIP** #152

Draft
wants to merge 1 commit into
base: main

Conversation

jeffeaton
Collaborator

In read_dhs_flat(), replace iotools::input.file() with vroom::vroom() for reading fixed-width text files.

vroom is much faster and more memory efficient than iotools. This speeds up all parsing and makes it possible to read very large datasets (e.g. the India DHS surveys) without exhausting system memory.
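For reference, a minimal sketch of reading a fixed-width file with vroom. In the vroom package the fixed-width reader is `vroom_fwf()`, with column positions given by `fwf_positions()`. The file contents, the `dict` table and its column names are hypothetical stand-ins, and this is not necessarily the exact call used in `read_dhs_flat()`.

```r
# Hypothetical fixed-width file: 3 chars id, 2 chars age, 1 char sex.
path <- tempfile(fileext = ".dat")
writeLines(c("00125F", "00231M", "00342F"), path)

# Hypothetical dictionary of start/end positions, standing in for the
# positions that would be parsed from the DHS dictionary file.
dict <- data.frame(
  name  = c("caseid", "v012", "sex"),
  start = c(1, 4, 6),
  end   = c(3, 5, 6)
)

# Read every column as character so nothing is coerced on import.
dat <- vroom::vroom_fwf(
  path,
  col_positions = vroom::fwf_positions(dict$start, dict$end, dict$name),
  col_types = vroom::cols(.default = vroom::col_character())
)
```

vroom indexes the file and materialises columns lazily, which is where the memory saving on large files would be expected to come from.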

Also replaces Map() with a for() loop when assigning variable labels, to avoid making a full copy of the dataset in memory.
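A minimal sketch of the two label-assignment patterns, using base R only. The toy data frame, the `labels` vector and the `"label"` attribute name are assumptions for illustration; the actual code in `read_dhs_flat()` may differ.

```r
# Toy data standing in for a parsed dataset (hypothetical names and labels).
dat <- data.frame(v001 = 1:3, v012 = c(25L, 31L, 42L))
labels <- c(v001 = "Cluster number", v012 = "Respondent's current age")

# Map() pattern: builds a complete new list of labelled columns,
# then assigns the whole list back into the data frame.
dat_map <- dat
dat_map[] <- Map(
  function(x, lab) {
    attr(x, "label") <- lab
    x
  },
  dat_map,
  labels[names(dat_map)]
)

# for() pattern: sets the attribute on one column at a time.
dat_loop <- dat
for (nm in names(labels)) {
  attr(dat_loop[[nm]], "label") <- labels[[nm]]
}
```

Both patterns end with the same labels attached. The difference is that the Map() version holds a second list of all columns while it runs, whereas the loop only touches one column per iteration.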

Work in progress

This PR works, but two things remain to be done:

  • vroom::vroom() is pretty chatty and emits lots of messages and warnings. We probably want to silence some of these.
  • Run a systematic download of all datasets to ensure no edge cases or unexpected issues.
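One possible way to quiet vroom, sketched under the assumption that the column-specification message and the progress bar are the main sources of noise. `read_fwf_quietly()` is a hypothetical helper, not part of this PR; `show_col_types` and `progress` are existing vroom arguments.

```r
# Hypothetical wrapper: turns off the column-spec message and progress bar.
read_fwf_quietly <- function(path, positions) {
  vroom::vroom_fwf(
    path,
    col_positions = positions,
    show_col_types = FALSE,  # no column specification message
    progress = FALSE         # no progress bar
  )
}

# Small hypothetical fixed-width file to exercise the wrapper.
path <- tempfile(fileext = ".dat")
writeLines(c("00125F", "00231M", "00342F"), path)

quiet_dat <- read_fwf_quietly(
  path,
  vroom::fwf_widths(c(3, 2, 1), c("caseid", "v012", "sex"))
)
```

Wrapping the call in `suppressWarnings()` would also hide parsing-problem warnings, which is probably too blunt if those warnings are how edge cases get noticed in the systematic download.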

@jeffeaton jeffeaton marked this pull request as draft April 21, 2024 18:17
@jeffeaton jeffeaton requested a review from OJWatson April 21, 2024 18:17