Should the show() method for SummarizedExperiment objects suggest saveHDF5SummarizedExperiment()? #59

Open
hpages opened this issue Oct 7, 2021 · 1 comment

hpages commented Oct 7, 2021

Vince (@vjcitn) suggested:

One could imagine the show method for SummarizedExperiment checking to see if there is evidence of HDF5 in the assay and if it finds it, it adds a line 'To serialize, use saveHDF5SummarizedExperiment.' It may not be completely foolproof but it might help.

Motivated by https://community-bioc.slack.com/archives/C6KJHH0M9/p1633536980007800
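
A minimal sketch of the kind of check such a show() method could perform, assuming the assays are DelayedArray objects and that HDF5 backing appears as an HDF5ArraySeed leaf in the tree of delayed operations. The helper name has_hdf5_assay() is hypothetical, and the check is not foolproof: other on-disk seeds (e.g. TileDB) and on-disk data stored outside the assays are not detected.

```r
library(SummarizedExperiment)
library(HDF5Array)

# Hypothetical helper (not part of any package): TRUE if at least one assay
# of 'se' is a DelayedArray with an HDF5ArraySeed among its leaf seeds.
has_hdf5_assay <- function(se) {
    assay_list <- as.list(assays(se, withDimnames = FALSE))
    any(vapply(assay_list, function(a) {
        if (!is(a, "DelayedArray")) {
            return(FALSE)  # ordinary in-memory assay
        }
        # seedApply() visits every leaf seed of the delayed-operation tree
        leaf_is_hdf5 <- seedApply(a, function(s) is(s, "HDF5ArraySeed"))
        any(unlist(leaf_is_hdf5))
    }, logical(1)))
}
```

A show() method could then print the extra 'To serialize, use saveHDF5SummarizedExperiment.' line only when has_hdf5_assay() returns TRUE.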

Another approach that was suggested is to add save() and saveRDS() generics to BiocGenerics and make them fail (advising use of the dedicated method) when handed an HDF5SummarizedExperiment derivative.

However, there are several complications with this:

  1. There's no HDF5SummarizedExperiment class. These objects are just SummarizedExperiment objects or derivatives, so dispatch cannot be used to distinguish those that have on-disk data from those that have in-memory data.
  2. On-disk data could be present in any object, not just SummarizedExperiment objects. For example a GRanges object could have a TileDBMatrix object in its metadata columns.
  3. save() cannot easily be turned into a generic.
  4. There are situations where it's ok to call save() or saveRDS() on an object with on-disk data.
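
To illustrate complication 1, the sketch below builds the same toy SummarizedExperiment twice, once with an in-memory matrix and once with an HDF5-backed assay written by writeHDF5Array(), and compares the classes of the two objects. Because both are plain SummarizedExperiment objects, a method dispatched on the class of the object cannot tell them apart. The toy data and the object names are made up for illustration.

```r
library(SummarizedExperiment)
library(HDF5Array)

# Toy data (illustration only)
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("g1", "g2"), c("s1", "s2", "s3")))

# Same content, two different backends for the assay
se_mem <- SummarizedExperiment(assays = list(counts = m))
se_h5  <- SummarizedExperiment(assays = list(counts = writeHDF5Array(m)))

# The containers have the same class, so S4 dispatch sees no difference
same_class <- identical(class(se_mem), class(se_h5))
same_class

# Only the assays themselves reveal the backend
class(assay(se_mem, withDimnames = FALSE))
class(assay(se_h5, withDimnames = FALSE))
```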

Feedback and suggestions are welcome.

PeteHaitch commented Oct 7, 2021

FWIW in BSseq we have a modified show() that alerts the user that some of the assays are HDF5-backed (it doesn't handle other on-disk backends) but it doesn't provide specific advice about how to save/serialize the object.

```r
suppressPackageStartupMessages(library(bsseq))
suppressWarnings(example("BSseq", "bsseq", echo = FALSE, verbose = FALSE))
#> Loading required package: DelayedArray
#> Loading required package: Matrix
#> 
#> Attaching package: 'Matrix'
#> The following object is masked from 'package:S4Vectors':
#> 
#>     expand
#> 
#> Attaching package: 'DelayedArray'
#> The following objects are masked from 'package:base':
#> 
#>     aperm, apply, rowsum, scale, sweep
#> Loading required package: rhdf5
#> 
#> Attaching package: 'HDF5Array'
#> The following object is masked from 'package:rhdf5':
#> 
#>     h5ls
hdf5_BS1
#> An object of type 'BSseq' with
#>   3 methylation loci
#>   3 samples
#> has not been smoothed
#> Some assays are HDF5Array-backed
```

Created on 2021-10-08 by the reprex package (v2.0.1)
