Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

remote.nextstrain_dot_org: Embed images into narratives on upload #235

Merged
merged 4 commits into from
Dec 13, 2022

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented Nov 19, 2022

This permits local authoring of narratives that reference images on the local filesystem without manual conversion (to remote images or embedded images) before upload.

Unfortunately, both pyright and mypy (to a lesser degree) have enough issues type checking the pyparsing API (and thus our usage of it), that it is more tenable to skip their type checking of the new nextstrain.cli.markdown module than to work around all their issues. Hopefully this will be better in the future! It's possible some of the issues lay with the type annotations in pyparsing itself, and thus are more easily improved.

A small number of new tests help ensure the parsing and embedding of images works as expected, though more should be added with time as we run across additional real-world test cases. The framework established here makes that as quick as adding another Markdown file to the test data.

CI issues

Both CI issues appear unrelated to this PR, so I'll likely fix them first in a separate PR.

Testing

@tsibley tsibley requested a review from a team November 19, 2022 01:03
@tsibley
Copy link
Member Author

tsibley commented Nov 19, 2022

Immediate 3.6 failure looks spurious. I'll re-run that job once the rest of the run completes.

@tsibley tsibley marked this pull request as draft November 19, 2022 01:08
@jameshadfield
Copy link
Member

jameshadfield commented Nov 20, 2022

Big fan of this idea. Local narrative editing (nextstrain view) has a couple of big gotchas which the new narratives editor/debugger tries to solve: (i) relative images and (ii) local vs nextstrain.org URLs. This PR solves (i), but (ii) remains; the /edit/narratives debugger solves (ii) but doesn't support (i).

On the /edit/narratives side, one could imagine using this code to drag&drop a markdown + the image files which are referenced in the markdown (but it'd have to be a JS version and thus quite different). When editing functionality is actually available you could drag on image files and have them base64 embedded by the browser (again, not possible to use the CLI code here).

For the local editing approach it's be nice to work out where we want to end up, and how it interfaces with /edit/narratives (if at all). Tom - do you envision the CLI addressing the URL gotchas (ii)? (I took the approach of no longer recommending local narratives editing at all.)

tsibley added a commit to nextstrain/auspice that referenced this pull request Nov 22, 2022
…ides

Fixes handling of intra-document references like links and images that
reference a definition found elsewhere in the file, e.g. before the
first slide or on a different slide than the link/image reference.  Such
features are soon to be used for embedding images (see related PR
below).

Anyone previewing a document with such features outside of Auspice's
narrative rendering would see the link/image references work, but they'd
be broken when viewed thru Auspice.  Since narrative documents are
intended to gracefully degrade with conventional Markdown rendering and
still be a viewable document (e.g. on GitHub), it's nice to address this
issue for that reason alone.

It's also nice to use the DOM as a structured way to split apart the
document into slides, rather than relying on string manipulation and
parsing.  An alternative approach would be to use a Markdown parser that
emits an AST or stream of nodes as an intermediate we could split into
slides while still preserving references/definitions, but that seemed
like a bigger change (and I'd have to go survey the parsers available
instead of using the browser's built-in DOM).

While the entire internals of parseNarrativeBody() change, the shape of
the output "slide objects" is the same.  The one minor exception is that
slides will always have a "mainDisplayMarkdown" property (though it will
usually be the empty string) even if they don't have a corresponding
code block in the source text.  All downstream usages I found condition
not on property existence but on truthiness of the value, so this should
result in no functional change.

Resolves: <#1599>
Related-to: <nextstrain/cli#235>
@tsibley
Copy link
Member Author

tsibley commented Nov 22, 2022

This work is pretty focused on the "upload a narrative to share it" part of the process and doesn't really consider the earlier iterative part of local authoring (e.g. write some slides, preview, rinse & repeat). There's obviously a lot more to do to improve local authoring!

On the /edit/narratives side, one could imagine using this code to drag&drop a markdown + the image files which are referenced in the markdown (but it'd have to be a JS version and thus quite different).

Do you have any guess at how useful this would be in practice? It seems like something that's super error prone (e.g. easy to miss dragging on an image file; debugging is disconnected from uploading, so you might get it right when debugging but not when uploading, etc.) and maybe only serves as a stop gap measure until full authoring in the browser?

(Handling image embedding in a JS client would necessarily be different, yes. That said, I think much of the approach (modifying a stream of nodes, the node structures, etc) could be more or less directly ported from Python to JS with the biggest differences being how the stream of parsed nodes is produced and how the nodes are turned back into Markdown (or directly converted to HTML).)

When editing functionality is actually available you could drag on image files and have them base64 embedded by the browser (again, not possible to use the CLI code here).

To me, this sort of authoring in the browser seems ideal from the user's POV and also our implementation POV. Compared to tracking down image references after the fact, it's much simpler to handle a dropped image and embed it appropriately when serializing the editor's document model to Markdown.

So IMO better to spend effort on making authoring in the browser a reality rather than the stop gap above.

For the local editing approach it's be nice to work out where we want to end up, and how it interfaces with /edit/narratives (if at all). Tom - do you envision the CLI addressing the URL gotchas (ii)? (I took the approach of no longer recommending local narratives editing at all.)

I'm not sure how the CLI would address (ii) without it getting overly complex to use as it requires more information than the narrative itself provides (e.g. the CLI would have to know both the remote nextstrain.org locations of and the local filesystem locations of the narrative and datasets used). If we could, then it seems worthwhile, but (ii) feels more like an Auspice issue and question of how narratives are designed (e.g. nextstrain/auspice#890).

@jameshadfield
Copy link
Member

This work is pretty focused on the "upload a narrative to share it" part of the process and doesn't really consider the earlier iterative part of local authoring

Yup - and my comments are not a critique of the code in this PR (which seems great) but rather questioning how this fits into the bigger picture of narrative authoring. I'm of the opinion that we should aim to move completely away from developing narratives locally (i.e. avoid using auspice / nextstrain view).

Do you have any guess at how useful this [dragging on markdown + relative images] would be in practice? It seems like something that's super error prone

Agree with this too - and the workflow of dropping images onto individual slides (which are then converted to base64) is the way to go for the editor. But this leads to asking what the use case is for relative markdown images? They are (and probably will be) incompatible with /edit/narratives and, as far as I can figure, don't work within auspice either [1].

[1] E.g. a image ![tree-local](./tree.jpg) results in a request for localhost:4000/narratives/test/tree.jpg (actual pathname depends on narrative file name) which is a 302 redirect to /. Perhaps there is a particular location within ./dist you could put the file to make it work, but it's not obvious!

@tsibley
Copy link
Member Author

tsibley commented Nov 22, 2022

but rather questioning how this fits into the bigger picture of narrative authoring. I'm of the opinion that we should aim to move completely away from developing narratives locally (i.e. avoid using auspice / nextstrain view).

Yeah, totally fair questions! :-)

I'm not so sure about moving entirely away from local narrative authoring. It seems important to keep the ability to do it offline and without sharing via nextstrain.org, e.g. both for folks with spotty connectivity and folks who don't want to/can't host their narratives on nextstrain.org. Even if Auspice gains an in-browser editor for narratives (that we then leverage on nextstrain.org for saving to Groups, etc.), I imagine it would be useful to run that editor locally against local files, i.e. similar to what nextstrain view is already doing (though we might use a different command name for editing).

Stepping back a bit, and also getting to the question of local datasets for local narratives, my general thought here is that Nextstrain CLI could provide the web API needed by Auspice to view (and edit) narratives and datasets. (Right now that means the Charon API, but I imagine it might also mean parts of the RESTful API too in the future.) So instead of Auspice making requests to its own built-in server backed by the filesystem, Nextstrain CLI would run Auspice so that it makes requests to a server that Nextstrain CLI also provides. This server could be backed by the local filesystem when appropriate but could also back itself by nextstrain.org (incl. as the user logged into the CLI).

But this leads to asking what the use case is for relative markdown images? They are (and probably will be) incompatible with /edit/narratives and, as far as I can figure, don't work within auspice either [1].

The use case I saw was to edit a narrative in your $EDITOR (or on GitHub's web editor) and previewing it in that same editor (or on GitHub). Any old Markdown renderer will be able to show the "narrative" part of each slide, and clicking the header links (to nextstrain.org or a local Auspice) shows the dataset view for that slide. This works well for an authoring process where the authoritative source for your narratives (incl. any images) are stored in a Git repository, for example.¹ With this PR then, when you upload to nextstrain.org to share it more widely, the images come along for the ride instead of breaking.

But maybe I've just imagined this use case and it isn't actually useful??

¹ This is also the process I'd like to implement for https://github.com/nextstrain/narratives instead of having nextstrain.org source directly from GitHub.

@huddlej
Copy link
Contributor

huddlej commented Nov 22, 2022

Jumping into the broader discussion:

I'm not so sure about moving entirely away from local narrative authoring. It seems important to keep the ability to do it offline and without sharing via nextstrain.org, e.g. both for folks with spotty connectivity and folks who don't want to/can't host their narratives on nextstrain.org.

This point is consistent with the recent conversations we've had with users at office hours where folks have wanted or needed to develop their narratives in private (ahead of a manuscript submission or to learn how narratives work in private). But, I also see why @jameshadfield would prefer to move away from this functionality since the local vs. remote naming is a complicated problem. If we think about the use cases in terms of preference for public/private narratives and remote/local authoring, I wonder if most users would be ok with remote authoring if they knew it could be private? Poor internet connectivity would still be an unavoidable problem with the remote authoring, though.

So instead of Auspice making requests to its own built-in server backed by the filesystem, Nextstrain CLI would run Auspice so that it makes requests to a server that Nextstrain CLI also provides.

This seems like a huge step for the CLI! It sounds interesting, although I don't remember discussing this before in our long-term plans for the CLI. Would you be open to talking more about this direction at our next Nextstrain meeting or just for a smaller meeting of people who are interested, @tsibley?

This works well for an authoring process where the authoritative source for your narratives (incl. any images) are stored in a Git repository, for example.¹ With this PR then, when you upload to nextstrain.org to share it more widely, the images come along for the ride instead of breaking.

I don't know if you've imagined this use case, although my personal experience is that I either work exclusively on a public narrative through GitHub or on a public/private narrative through Nextstrain Groups, but I never manage a narrative in both a GitHub repository and a group. This PR would make the second type of authoring (which I do most often) much easier, avoiding the need to post images somewhere publicly. I can see why versioning plus groups storage would be valuable, but for better or worse, I treat nextstrain remote upload like I do the "Save" button in Word; I want to iterate quickly through minor edits without making separate commits.

@tsibley tsibley mentioned this pull request Nov 22, 2022
1 task
tsibley added a commit that referenced this pull request Nov 22, 2022
We request "ubuntu-latest" for our CI jobs to stay current.  GitHub
Actions started switching that from 20.04 → 22.04 at the beginning of
October¹, but actions/setup-python doesn't support Python 3.6 on Ubuntu
22.04.  This issue was first identified in CI runs for an unrelated PR.²

GitHub Actions documents matrix includes to happen after matrix
excludes, so combining the two works well to replace a specific job.

¹ https://github.blog/changelog/2022-11-09-github-actions-ubuntu-latest-workflows-will-use-ubuntu-22-04/
² #235
@emmahodcroft
Copy link
Member

I feel like I'm only qualified to weigh in a little bit here, as I don't fully understand the technical side, but just to give a few thoughts:

  • I think local, private narrative development is likely important for some people. I know in Edinburgh with HIV they can't put the data or narratives online really ever (they got special permission to temporarily do this for the poster session at VGE but had to take everything down afterwards) - they normally just make these internally & then share the raw files to view locally (ex with their collaborators in Kazakhstan). I imagine others may have similar data restrictions or fears, even if they aren't super common (on top of internet connection stuff, which is also a reality for many). TLDR: I'd hesitate to remove local narrative development functionality completely. Even if it meant 'no pictures just trees' because of the file-things.
  • Adding something to the CLI so that one could more easily do narratives locally would be very cool and make it more accessible!

@jameshadfield
Copy link
Member

jameshadfield commented Nov 22, 2022

I'll summarise my current thoughts here, but I think this has arrived at the point where we need a synchronous meeting to talk through these things. There's no objections from me about the technical possibilities here, but I think we should consider what we want to support / invest in.

Nextstrain CLI using nextstrain.org API handlers

I implemented a prototype of this to develop the /edit/narratives functionality -- https://github.com/nextstrain/auspice/blob/a416642765e15881b025c23967bf2625f444b3e1/src/components/narrativeEditor/updateApiCalls.js#L10-L14 -- it's technically not too hard (actually, I don't know how hard this would be), but I'm ambivalent about whether it's a good idea (I can see some good use cases, but I can see it becoming confusing as well). Let's discuss further!

Public vs Private

I think there's some confusion here which we could clarify more (and it's not really a binary thing either). Using nextstrain.org/edit/narratives page is private in the sense that the narrative doesn't leave your browser window, and allows private Groups datasets to be used (if you're logged in). This is equivalent to viewing your narrative using a version of nextstrain-cli which accesses nextstrain.org datasets [1].

If this is not considered private enough, then storing your datasets or narratives on nextstrain.org (private groups) is not going to be private enough, and probably auspice.us isn't either. You are then in a world where you're running your own auspice server & in that case I think there is a certain level of technical proficiency expected. (To be clear - local narratives development won't go away from Auspice - I just don't think it's the go-to technique we should recommend. This includes local usage of /edit/narratives backed by local files, which I think should already be working! In the future we could expand this to allow users to save narrative changes back to disk.)

GitHub vs Groups

The use case described above (develop markdown by viewing on GitHub web UI, then upload the narrative using nextstrain CLI) is pretty niche to me, but I'd love to hear dissenting experiences! Rather, I think coupling this editing approach with Nextstrain Community sharing is a nicer UX [2,3] as it keeps a single source of truth and avoids out-of-sync issues. This relates to how we're storing our core narratives: yes it's an oddity we're using a GitHub repo (a minus) but it's clear which file is the source of truth (a big plus).

[1] Maybe a nextstrain.org browser has a larger attack surface than a localhost one, but I think they're equivalently secure for this conversation.

[2] But it doesn't support relative images. We've gotten around this by explaining how to use the actual GitHub image path in our tutorials. Relative images here could be solved at a technical level in the nextstrain.org codebase for Community narratives, which would be cool!

[3] And it (Community) doesn't support private repos. In this case I'd still lean towards the /edit/narratives + nextstrain CLI approach.

@tsibley
Copy link
Member Author

tsibley commented Nov 22, 2022

@huddlej wrote:

This seems like a huge step for the CLI! It sounds interesting, although I don't remember discussing this before in our long-term plans for the CLI. Would you be open to talking more about this direction at our next Nextstrain meeting or just for a smaller meeting of people who are interested, @tsibley?

For sure. You hadn't heard it discussed before because it was something I thought of in the shower last night. ;-)

@tsibley
Copy link
Member Author

tsibley commented Nov 22, 2022

@huddlej wrote:

I don't know if you've imagined this use case, although my personal experience is that I either work exclusively on a public narrative through GitHub or on a public/private narrative through Nextstrain Groups, but I never manage a narrative in both a GitHub repository and a group. […] for better or worse, I treat nextstrain remote upload like I do the "Save" button in Word; I want to iterate quickly through minor edits without making separate commits.

When you're done an editing session, do you keep the local file around and pick up from there for your next editing session? Or do you discard the local file and nextstrain remote download it again as/if necessary?

Do you do much collaborative authoring? I'd suspect keeping narratives in version control would be more compelling in that use case (e.g. just like with collaborative manuscript authoring). When version control is involved, nextstrain remote upload (or some web uploader) is more akin to a publishing or releasing step than a checkpoint work-in-progress.

@tsibley
Copy link
Member Author

tsibley commented Nov 22, 2022

@jameshadfield wrote:

I'll summarise my current thoughts here, but I think this has arrived at the point where we need a synchronous meeting to talk through these things.

👍 Yes, would be good to have a smaller work group meeting to hash out the goals (and non-goals) for local narratives and make a cohesive plan between Auspice, nextstrain.org, Nextstrain CLI, etc.

The use case described above (develop markdown by viewing on GitHub web UI, then upload the narrative using nextstrain CLI) is pretty niche to me, but I'd love to hear dissenting experiences!

GitHub might be niche, but it's just an example for a more general use case that seems likely to exist (but again, I don't know if it does). As another example, you can easily imagine someone keeping their narratives in a Dropbox or other shared folder and collaborating on them there, relying on their editor to show a Markdown preview (as many do these days, e.g. VSCode), and then uploading when ready to share.

Rather, I think coupling this editing approach with Nextstrain Community sharing is a nicer UX [2,3] as it keeps a single source of truth and avoids out-of-sync issues.

Community's use of GitHub repos and how files map has seemed confusing to many over time (unless they're already in that niche of using Git). I think the concept of "upload to nextstrain.org" is clarifying compared to the indirection of "put this on GitHub, at this specific place, and then type in a nextstrain.org URL over here". And I think the "author in private, upload to make public" use case is also important.

This relates to how we're storing our core narratives: yes it's an oddity we're using a GitHub repo (a minus) but it's clear which file is the source of truth (a big plus).

I don't think it's an oddity that we're using a repo to track and author core narratives. It's just like how we use repos to track and author core datasets (via the workflows that produce them). And I think it's clearer when what you see on nextstrain.org is what's been uploaded to nextstrain.org, rather than something stored elsewhere but accessed via nextstrain.org. (I'm counting s3://nextstrain-data as part of nextstrain.org here, as it's a mostly internal detail.)

[2] But it doesn't support relative images. We've gotten around this by explaining how to use the actual GitHub image path in our tutorials. Relative images here could be solved at a technical level in the nextstrain.org codebase for Community narratives, which would be cool!

Indeed! We could use a similar approach as here with rewriting the Markdown (or otherwise arranging the correct base URL for resources).

@jameshadfield
Copy link
Member

jameshadfield commented Nov 22, 2022

I'll just add one (final) comment which is to observe that a big part of crafting a narrative is how the dataset will be visualised alongside the slide content. You can get some way there by loading [1] the (dataset) URL in a browser and then looking at the markdown rendering of the slide (in GitHub, VSCode etc etc), but there are some subtle differences (text styling of the markdown, viz panel dimensions), and I think it's a nicer editing experience to see what the slide will actually look like. This was (one of) the motivations for /edit/narratives.

[1] I mean explicitly loading that URL -- getting the viz into a particular state, copy & pasting the URL into the markdown, and then moving on isn't enough as we don't store every bit of UI state in the URL query.

@huddlej
Copy link
Contributor

huddlej commented Nov 23, 2022

When you're done an editing session, do you keep the local file around and pick up from there for your next editing session? Or do you discard the local file and nextstrain remote download it again as/if necessary?

I always treat the version of the narrative stored in the group as the source of truth. When I switch between home and work machines, for example, I always download the remote version before starting new work.

Do you do much collaborative authoring? I'd suspect keeping narratives in version control would be more compelling in that use case (e.g. just like with collaborative manuscript authoring). When version control is involved, nextstrain remote upload (or some web uploader) is more akin to a publishing or releasing step than a checkpoint work-in-progress.

No, but I plan to do more for seasonal flu, at least. I bet you're right that my current workflow will break down the first time @j23414 and I upload slightly different versions of the same narrative without "pulling" first. The frustrating part is that version control would add another layer of complexity by requiring a new private GitHub repo for each private group I'm writing narratives for and the actual authoring process (currently) requires uploading to the group to see the changes reflected, so the potential for conflicts in co-authorship remains. Maybe the better solution would be to author the narrative locally in version control, drag onto narratives debugger to confirm changes, commit and push changes to a parallel private repo (e.g., github.com/nextstrain/nextflu-private), and have a GitHub Action on the repo upload the narrative to the group to avoid conflicts?

tsibley added a commit that referenced this pull request Nov 23, 2022
Builds with this version are seemingly-successful until runtime when
this error is thrown:

    Traceback (most recent call last):
      …
      File "cryptography.hazmat.primitives.asymmetric.ec", line 11, in <module>
      File "cryptography.hazmat._oid", line 7, in <module>
    ImportError: …/build/x86_64-unknown-linux-gnu/debug/installation/lib/cryptography/hazmat/bindings/_rust.abi3.so: undefined symbol: PyExc_BaseException

Why 0.23.0 results in a bad build remains unclear, though I'll continue
to try to track down the cause.  I tested and confirmed it's not the
Python 3.10.4 → 3.10.8 upgrade noted in the 0.23.0 release notes¹.
Maybe it's the SETUPTOOLS_USE_DISTUTILS=stdlib change?

For now, avoiding 0.23.0 means we'll use the previous release, 0.22.0,
which still works well in my local testing and is what we built our last
release, 5.0.1, with.  I'm hopeful the next release of PyOxidizer will
fix the issue; the single version exclusion will help us notice if it
doesn't.

This issue was first identified in CI runs for an unrelated PR.²

¹ https://github.com/indygreg/PyOxidizer/releases/tag/pyoxidizer%2F0.23.0
² #235
@joverlee521
Copy link
Contributor

I bet you're right that my current workflow will break down the first time @j23414 and I upload slightly different versions of the same narrative without "pulling" first.

Maybe we can add a "checkout" system for files (based on my minimal understanding of the Perforce version control system). If one editor has checked out a file, then the other editors can only download the file but cannot update it. This would also work well with the web editor: the web editor would checkout the file while a user is editing so no one else can update the file. Once the user has made the intended edits, then the file can be checked in and the source of truth gets updated.

@tsibley
Copy link
Member Author

tsibley commented Nov 23, 2022

@jameshadfield wrote:

I think it's a nicer editing experience to see what the slide will actually look like.

Absolutely! I don't think anyone's saying otherwise. :-)

@huddlej wrote:

for example, I always download the remote version before starting new work.

Good to know!

I bet you're right that my current workflow will break down the first time @j23414 and I upload slightly different versions of the same narrative without "pulling" first. […] so the potential for conflicts in co-authorship remains.

Perhaps Nextstrain CLI should implement conditional requests (If-Match and/or If-Unmodified-Since) to avoid stomping on unexpected edits. These are supported by the nextstrain.org API side, but not made use of by Nextstrain CLI. As their current support is for caching, there might be more nextstrain.org work needed to make them actually useful for conflict avoidance. One tricky part would be maintaining state on the Nextstrain CLI side.

An in-browser editor would also greatly benefit from conditional requests, so work on the nextstrain.org to support them well would be worth it. And such an editor wouldn't suffer from questions of how to maintain state.

Maybe the better solution would be to author the narrative locally in version control, drag onto narratives debugger to confirm changes, commit and push changes to a parallel private repo (e.g., github.com/nextstrain/nextflu-private), and have a GitHub Action on the repo upload the narrative to the group to avoid conflicts?

This would be one way to do it, a particularly infra-y way. It's certainly not a workflow I'd recommend to someone not steeped in software—the same way I wouldn't recommend authoring a manuscript in a Git repo to someone not steeped in software—but I suspect it'd work well for you and others on our team.

For collaboration specifically, the non-software-y alternative to this way of doing things is using an in-browser editor with conflict avoidance, or ideally, live multi-user editing. (But I suspect the latter is still more effort than we want to expend, even if the experience is light years better.)

@tsibley
Copy link
Member Author

tsibley commented Nov 23, 2022

I missed @joverlee521's reply while writing my own!

Maybe we can add a "checkout" system for files

Locking has some inherent issues around folks/systems forgetting to release the lock (e.g. think cafe bathroom keys that walk off in someone's pocket). There's ways of mitigating the issue (records of who's holding the lock so you can pester them in side channels, automatic lock timeouts, manual lock overrides), but these all require quite a bit more interface and are all nicely avoided with conditional requests. (Also, they're not mutually exclusive, so we could do both.) Locks also mean having to build UI (e.g. on the web and in the CLI) to manually unlock resources, since there will always be times when that's needed and having to email support to do it isn't really tenable.

@tsibley tsibley force-pushed the trs/remote/upload/embed-narrative-images branch from 0179c7b to 5ee192d Compare November 23, 2022 23:35
@tsibley
Copy link
Member Author

tsibley commented Nov 23, 2022

Rebased this PR to fix unrelated CI issues.

tsibley added a commit to nextstrain/auspice that referenced this pull request Dec 1, 2022
…ides

Fixes handling of intra-document references like links and images that
reference a definition found elsewhere in the file, e.g. before the
first slide or on a different slide than the link/image reference.  Such
features are soon to be used for embedding images (see related PR
below).

Anyone previewing a document with such features outside of Auspice's
narrative rendering would see the link/image references work, but they'd
be broken when viewed thru Auspice.  Since narrative documents are
intended to gracefully degrade with conventional Markdown rendering and
still be a viewable document (e.g. on GitHub), it's nice to address this
issue for that reason alone.

It's also nice to use the DOM as a structured way to split apart the
document into slides, rather than relying on string manipulation and
parsing.  An alternative approach would be to use a Markdown parser that
emits an AST or stream of nodes as an intermediate we could split into
slides while still preserving references/definitions, but that seemed
like a bigger change (and I'd have to go survey the parsers available
instead of using the browser's built-in DOM).

While the entire internals of parseNarrativeBody() change, the shape of
the output "slide objects" is the same.  The one minor exception is that
slides will always have a "mainDisplayMarkdown" property (though it will
usually be the empty string) even if they don't have a corresponding
code block in the source text.  All downstream usages I found condition
not on property existence but on truthiness of the value, so this should
result in no functional change.

Resolves: <#1599>
Related-to: <nextstrain/cli#235>
@tsibley tsibley marked this pull request as ready for review December 6, 2022 21:35
@tsibley tsibley force-pushed the trs/remote/upload/embed-narrative-images branch from 5ee192d to 2f37016 Compare December 7, 2022 18:59
@tsibley
Copy link
Member Author

tsibley commented Dec 7, 2022

Rebased onto latest master to fix conflicts, and added a changelog entry.

This permits local authoring of narratives that reference images on the
local filesystem without manual conversion (to remote images or embedded
images) before upload.

Unfortunately, both pyright and mypy (to a lesser degree) have enough
issues type checking the pyparsing API (and thus our usage of it), that
it is more tenable to skip their type checking of the new
nextstrain.cli.markdown module than to work around all their issues.
Hopefully this will be better in the future!  It's possible some of the
issues lay with the type annotations in pyparsing itself, and thus are
more easily improved.

A small number of new tests help ensure the parsing and embedding of
images works as expected, though more should be added with time as we
run across additional real-world test cases.  The framework established
here makes that as quick as adding another Markdown file to the test
data.
…edding

It's often useful to have an escape hatch, especially for things in the
critical path (here, the upload path).  If for any reason some narrative
file finds a bug in the image embedding code and causes a crash or
corrupts the Markdown, we'll appreciate having a way to disable it at
runtime.
@tsibley tsibley force-pushed the trs/remote/upload/embed-narrative-images branch from 86a57df to 4fee60a Compare December 13, 2022 00:59
@tsibley
Copy link
Member Author

tsibley commented Dec 13, 2022

Added 56f0673 and rebased onto latest master to fix conflicts.

I plan to merge tomorrow (and probably cut a new CLI release) unless folks have review comments/objections before then.

CHANGES.md Outdated
@@ -51,6 +51,13 @@ change is described below.

[#240]: https://github.com/nextstrain/cli/pull/240

* Local images used in a narrative are now automatically embedded into it when
uploading the narrative to nextstrain.org via `nextstrain remote upload`.
This permits local authoring of narratives that reference images on the local
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd clarify here that "local authoring" means in a markdown renderer and that relative images will not work in auspice.

@tsibley tsibley merged commit c8d7aa5 into master Dec 13, 2022
@tsibley tsibley deleted the trs/remote/upload/embed-narrative-images branch December 13, 2022 18:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

5 participants