global base path for shed_upload include relative paths #160

Closed
peterjc opened this issue Apr 30, 2015 · 13 comments

@peterjc
Contributor

peterjc commented Apr 30, 2015

I'm trying to replace a manual process for preparing tar-balls to push to the Tool Shed. However, because I use a common top-level test-data/ (and tool-data/) folder, this is harder than expected.

Current manual process as documented in this tool's README file:

$ tar -czf venn_list.tar.gz tools/venn_list/README.rst tools/venn_list/venn_list.* tools/venn_list/tool_dependencies.xml test-data/magic.pdf test-data/venn_list.tabular test-data/rhodopsin_proteins.fasta
$ tar -tzf venn_list.tar.gz
tools/venn_list/README.rst
tools/venn_list/venn_list.py
tools/venn_list/venn_list.xml
tools/venn_list/tool_dependencies.xml
test-data/magic.pdf
test-data/venn_list.tabular
test-data/rhodopsin_proteins.fasta
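
For comparison, the manual tar step above could be scripted along these lines. This is only a minimal sketch using Python's standard tarfile module; build_tarball and list_tarball are hypothetical helper names, not part of planemo.

```python
import tarfile
from pathlib import Path


def build_tarball(base_dir, relative_paths, tar_path):
    """Write a gzipped tar-ball whose member names are relative to base_dir."""
    base = Path(base_dir)
    with tarfile.open(tar_path, "w:gz") as tar:
        for rel in relative_paths:
            # arcname is the name stored in the archive, independent of
            # where the file actually lives on disk.
            tar.add(str(base / rel), arcname=rel)


def list_tarball(tar_path):
    """Return the member names, much like `tar -tzf`."""
    with tarfile.open(tar_path, "r:gz") as tar:
        return tar.getnames()
```

Calling build_tarball from the repository root with the seven paths listed above should give the same member names as the tar command, although unlike the shell version it does not expand wildcards such as venn_list.*.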

I have been able to get the desired output (ignoring the order, #159) with a complicated .shed.yml entry like this:

include:
- source: README.rst
  destination: tools/venn_list/README.rst
- source: venn_list.py
  destination: tools/venn_list/venn_list.py
- source: venn_list.xml
  destination: tools/venn_list/venn_list.xml
- source: tool_dependencies.xml
  destination: tools/venn_list/tool_dependencies.xml
- source: ../../test-data/magic.pdf
  strip_components: 2
- source: ../../test-data/venn_list.tabular
  strip_components: 2
- source: ../../test-data/rhodopsin_proteins.fasta
  strip_components: 2

Giving:

$ planemo shed_upload --tar_only  ~/repositories/pico_galaxy/tools/venn_list 
cp /tmp/tmpybJqwL shed_upload.tar.gz
(.venv)[galaxy@ppserver planemo]$ tar -tzf /tmp/tmpybJqwL
tools/
tools/venn_list/
tools/venn_list/venn_list.py
tools/venn_list/tool_dependencies.xml
tools/venn_list/README.rst
tools/venn_list/venn_list.xml
test-data/
test-data/rhodopsin_proteins.fasta
test-data/magic.pdf
test-data/venn_list.tabular
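
Judging only from the output above, strip_components appears to drop the leading path components of the source to form the name inside the tar-ball. A minimal sketch of that mapping follows; the function is hypothetical and is not taken from planemo's code.

```python
from pathlib import PurePosixPath


def strip_components(source, count):
    """Drop the first `count` path components of `source`."""
    parts = PurePosixPath(source).parts
    if count >= len(parts):
        # Stripping everything would leave no file name at all.
        raise ValueError("cannot strip %d components from %r" % (count, source))
    return str(PurePosixPath(*parts[count:]))


# The two leading ".." components are removed.
print(strip_components("../../test-data/magic.pdf", 2))  # test-data/magic.pdf
```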

I would like to use something along these lines in the .shed.yml file:

#All include paths from two levels up (i.e. prefix with ../../ to find the file)
include_offset: 2
include:
- tools/venn_list/README.rst
- tools/venn_list/venn_list.py
- tools/venn_list/venn_list.xml
- tools/venn_list/tool_dependencies.xml
- test-data/magic.pdf
- test-data/venn_list.tabular
- test-data/rhodopsin_proteins.fasta
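
To make the proposal concrete, here is a minimal sketch of how such a setting might be resolved. Note that include_offset is only the option suggested in this issue, and resolve_includes is a hypothetical helper, not existing planemo behaviour.

```python
from pathlib import Path


def resolve_includes(shed_dir, include_offset, includes):
    """Map each include entry to (path on disk, name inside the tar-ball).

    shed_dir is the folder holding .shed.yml, assumed here to be an
    absolute path; the base used for lookups is include_offset levels
    above it.
    """
    base = Path(shed_dir)
    for _ in range(include_offset):
        base = base.parent
    # The archive name is the entry exactly as written in .shed.yml.
    return [(base / rel, rel) for rel in includes]
```

With shed_dir set to ~/repositories/pico_galaxy/tools/venn_list (expanded to an absolute path) and an offset of 2, every entry would be looked up under ~/repositories/pico_galaxy and stored under the name given in the list.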
@hexylena
Member

Wouldn't it just be easier to symlink test-data?

@peterjc
Contributor Author

peterjc commented Apr 30, 2015

So, make ~/repositories/pico_galaxy/tools/venn_list/test-data a symlink pointing at ~/repositories/pico_galaxy/test-data:

$ cd ~/repositories/pico_galaxy/tools/venn_list/
$ ln -s ../../test-data/ test-data

And then use something like this in .shed.yml,

include:
- README.rst
- venn_list.py
- venn_list.xml
- tool_dependencies.xml
- test-data/magic.pdf
- test-data/venn_list.tabular
- test-data/rhodopsin_proteins.fasta

That works:

$ planemo shed_upload --tar_only  ~/repositories/pico_galaxy/tools/venn_list 
cp /tmp/tmp3GYrUC shed_upload.tar.gz
$ tar -xvf /tmp/tmp3GYrUC
venn_list.py
tool_dependencies.xml
README.rst
test-data/
test-data/rhodopsin_proteins.fasta
test-data/magic.pdf
test-data/venn_list.tabular
venn_list.xml

This means a one-off upheaval in the tool history on the Tool Shed (the main files would move, e.g. tools/venn_list/venn_list.xml --> venn_list.xml), and some one-off work for me (making the symlinks).

Essentially this means pushing the directory structure outlined here https://github.com/galaxy-iuc/standards/blob/master/docs/best_practices/repositories.rst (over the approach I'm currently using based on how things were set up back before the Tool Shed existed).

I don't like the symlink idea, but I do have a lot of common test files used in multiple Tool Shed repositories (e.g. test-data/rhodopsin_proteins.fasta and test-data/magic.pdf in this example).

@jmchilton
Member

I guess @erasche was advocating linking in just the test-data needed (there will be an option in planemo to auto-infer the test-data at some point so it would be moot either way).

I like the new structure - Greg and Dan always advocated one tool per repository and I think planemo is getting close to making that... well, let's say a lot less painful for developers. In that world view - I am not sure a tools directory makes a lot of sense. I feel like each repository should have one primary artifact, all in the root: a tool_dependencies.xml for packages, a repository_dependencies.xml for suites, and a tool XML file for a tool directory.

That said - I don't want planemo to force a world view - just sort of subtly hint at it. Would globs have simplified your original include statements?

include:
- source: "*"
  destination: tools/venn_list/
test-data:
  include:
  - ../../test-data/magic.pdf
  - ../../test-data/venn_list.tabular
  - ../../test-data/rhodopsin_proteins.fasta

The glob above should work (let me know if it doesn't). The test-data directive doesn't exist yet, but adding it makes sense to me.
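
As a rough illustration of what the glob entry is meant to do, the sketch below expands a source pattern relative to the folder holding .shed.yml and places each match under the destination. The function name and the choice to skip sub-directories are assumptions made for this example, not a description of planemo's implementation.

```python
import glob
import os


def expand_include(shed_dir, source_pattern, destination):
    """Return (path on disk, name in tar-ball) pairs for a glob include."""
    pairs = []
    for path in sorted(glob.glob(os.path.join(shed_dir, source_pattern))):
        if not os.path.isfile(path):
            # Assumption for this sketch: sub-directories are skipped.
            continue
        name = os.path.join(destination, os.path.basename(path))
        pairs.append((path, name.replace(os.sep, "/")))
    return pairs
```

One detail worth noting: Python's glob does not match names starting with a dot for the pattern "*", so .shed.yml itself would be left out here.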

@jmchilton
Member

Wait - the tool shed will pick up tools/venn_list/tool_dependencies.xml? Does it just search the whole repository for a file named tool_dependencies.xml? I didn't realize that - planemo assumes it is in . for shed linting purposes.

@hexylena
Member

Essentially this means pushing the directory structure outlined here https://github.com/galaxy-iuc/standards/blob/master/docs/best_practices/repositories.rst (over the approach I'm currently using based on how things were set up back before the Tool Shed existed).

Ah...yeah, I may be overly fond of that directory structure. Symlinking individual files rather than the whole folder is definitely an option. As with @jmchilton, I don't mean to force a particular worldview.

@peterjc
Contributor Author

peterjc commented Apr 30, 2015

I'm not too worried about getting tools/venn_list/* with or without this path prefix.

My concern is ../../test-data/* and ../../tool-data/* from my common folders. Having these auto-detected via the <test> tags, and *.loc files via tool_data_table_conf.xml.sample would be an interesting alternative to the include manifest directive.

@jmchilton
Member

Created two issues for test data and tool data specifically. Symbolic links I guess are a workaround for custom repository structures - but being able to infer those automatically would be great for tool de-multiplexing as well.

@peterjc
Contributor Author

peterjc commented Apr 30, 2015

Thanks John; #162 and #163 look very interesting.

There can be other misc files (e.g. helper scripts, tool data files, images for README.rst) but for the most part I would expect them to be within the same folder as .shed.yml, and so no problem for the include mechanism.

The only other example I have come up with is including a top level file like ../../LICENSE as LICENSE within the tar-ball.

@bgruening
Member

I would like to add the static folder to the list: https://github.com/galaxyproject/tools-iuc/tree/master/tools/bedtools/static/images

@peterjc
Contributor Author

peterjc commented Apr 30, 2015

@bgruening but that isn't a top level folder (above the .shed.yml file), so surely you can already do this with something simple like:

include:
- static/*.png

Or do you mean another magic feature to parse the README.rst and tool RST for images to be bundled? (like #162 and #163).

@bgruening
Member

@peterjc yes, it would be nice if we can parse the RST and upload those images that are needed.
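
One way this could work, as a minimal sketch: scan the RST text for image and figure directives and collect the local paths they name. The regular expression and function name are assumptions for illustration; a real implementation might prefer a proper RST parser such as docutils.

```python
import re

# Matches ".. image:: path", ".. figure:: path" and the substitution
# form ".. |name| image:: path".
IMAGE_DIRECTIVE = re.compile(
    r"^\s*\.\.\s+(?:\|[^|]+\|\s+)?(?:image|figure)::\s+(\S+)",
    re.MULTILINE,
)


def referenced_images(rst_text):
    """Return local image paths used in RST text, skipping remote URLs."""
    return [
        target
        for target in IMAGE_DIRECTIVE.findall(rst_text)
        if "://" not in target
    ]
```

The returned paths could then be added to the include list automatically, in the same spirit as #162 and #163.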

@jmchilton
Member

Sounds like #185 and these other workarounds are sufficient. Thanks a ton @peterjc!

@peterjc
Contributor Author

peterjc commented May 7, 2015

Yes, with #185 merged, this alternative mechanism is not needed, and can be closed. Thanks.
