-
Notifications
You must be signed in to change notification settings - Fork 22
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reintroduce support for acronyms in Legislative Lists and Calls To Action #285
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This test is currently failing as abbreviations within Call To Action elements do not currently work (as reported by a user). This functionality will be added in later commits.
This test currently fails as the code to support this has been commented out. Work to fix the issue will be done in a later commit.
In a previous Zendesk ticket (https://govuk.zendesk.com/agent/tickets/4819383), a problem was reported where abbreviations within HTML tag elements were being replaced with abbreviation tags. The fix at the time was to revert the code for adding abbreviations to legislative lists and calls to action. This adds a test for the previously problematic scenario, so we can implement the abbreviations functionality again in a later commit.
This was previously not working (before the code was commented out) as: - multiple calls to `preprocess` meant `@acronyms` was empty on susbsequent runs (using `concat` solves that) - `add_acronym_alt_text` was returning the output of the `each` method (which was some array) rather than the updated HTML Support for other types of element was already commented out and that behaviour has been retained here. It will be fixed in later commits.
This code was previously removed as it was considered unreliable as acronyms within HTML tag attributes were also being replaced. Adding it back in, but now parsing the acronyms before converting to HTML.
1b4be18
to
2088e81
Compare
This adds the `span` that was added to buttons in alphagov/govuk_publishing_components#3478.
2088e81
to
ed07d0b
Compare
d48b871
to
442b0a7
Compare
CristinaRO
approved these changes
Aug 23, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed commit messages and step-by-step walkthrough! I think I understand enough to approve, but I won't be offended if you want to get a second opinion from someone with more experience in this area.
jkempster34
added a commit
that referenced
this pull request
Oct 11, 2023
Kramdown handles abbreviations and footnotes natively. Call to action and legislative list are custom Govspeak components built as extensions to Kramdown. These sections are parsed separately in preprocessing, they are ignored in any subsequent parsing because Kramdown ignores HTML. Because of this, we have added custom handling of abbreviations and footnotes specifically for these components [1]. This custom implementation means that abbreviations defined anywhere in the document will be applied to call to action and legislative list components. Otherwise, only abbreviations defined within these components would be applied. This custom implementation has been shown to contain bugs through several Zendesk tickets. The main issues reported are: 1. Acronyms are inserted into the produced HTML, even if undesirable. For example, a link `<email@example.com>` with the acronym `*[email]: Electronic mail is a method of transmitting and receiving messages` would produce `<p>href="mailto:<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com"<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com</p>` rather than the expected link. 2. Because the `add_acronym_alt_text` method runs through the text for each acronym, acronyms can be inserted into other acronyms creating invalid HTML. If we change tack, and instead allow Kramdown to parse these components as part of the main conversion to HTML (instead of in preprocessing), Kramdown will handle abbreviations and footnotes correctly, fixing both of these issues (as well as some other undocumented issues). The call to action component requires a surrounding `div` with the `call-to-action` class. Because Kramdown syntax is normally not processed inside a HTML tag, we must toggle the `parse_block_html` option [2]. There is a small regression in that the `$CTA` block must be proceeded by a blank line (demonstrated in `test/govspeak_test.rb:428`), but this seems okay as it's standard markdown. Some of the tests can likely be removed as we are now testing the functionality of the library, however, for now these represent that there are no regressions. [1]: #285 [2]: https://kramdown.gettalong.org/quickref.html#html-elements
jkempster34
added a commit
that referenced
this pull request
Oct 11, 2023
Kramdown handles abbreviations and footnotes natively. Call to action and legislative list are custom Govspeak components built as extensions to Kramdown. These sections are parsed separately in preprocessing, they are ignored in any subsequent parsing because Kramdown ignores HTML. Because of this, we have added custom handling of abbreviations and footnotes specifically for these components [1]. This custom implementation means that abbreviations defined anywhere in the document will be applied to call to action and legislative list components. Otherwise, only abbreviations defined within these components would be applied. This custom implementation has been shown to contain bugs through several Zendesk tickets. The main issues reported are: 1. Acronyms are inserted into the produced HTML, even if undesirable. For example, a link `<email@example.com>` with the acronym `*[email]: Electronic mail is a method of transmitting and receiving messages` would produce `<p>href="mailto:<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com"<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com</p>` rather than the expected link. 2. Because the `add_acronym_alt_text` method runs through the text for each acronym, acronyms can be inserted into other acronyms creating invalid HTML. If we change tack, and instead allow Kramdown to parse these components as part of the main conversion to HTML (instead of in preprocessing), Kramdown will handle abbreviations and footnotes correctly, fixing both of these issues (as well as some other undocumented issues). The legislative list component currently works by overriding Kramdowns `list` extension [2] and disabling ordered lists [3]. So that we can parse this component as part of the main conversion, this applies two options: `parse_block_html` [4], so that markdown inside the `legislative-list-wrapper` `div` is parsed by Kramdown, and a new custom option of `ordered_lists_disabled`, so that we can control flow inside our `Parser::Govuk` class (subclass of the Kramdown parser). It is possible to override `parse_list` instead of `parse_block_html`, but this means that the outermost list will not be parsed (by the time `parse_list` is called the outermost list is already being parsed) The only difference between this new iteration of the legislative list and the previous iteration, is that we now wrap the component in a `legislative-list-wrapper` `div`. We could remove this in postprocessing, but this requires additional modification of the Kramdown produced HTML which seems unnecessary. This also allows us to remove all remaining code which replicates the footnote and acronym functions of Kramdown. Some of the tests can likely be removed as we are now testing the functionality of the library, however, for now these represent that there are no regressions. [1]: #285 [2]: https://github.com/gettalong/kramdown/blob/bd678ecb59f70778fdb3b08bdcd39e2ab7379b45/lib/kramdown/parser/kramdown/list.rb#L54 [3]: #25 [4]: https://kramdown.gettalong.org/quickref.html#html-elements
jkempster34
added a commit
that referenced
this pull request
Oct 12, 2023
Kramdown handles abbreviations and footnotes natively. Call to action and legislative list are custom Govspeak components built as extensions to Kramdown. These sections are parsed separately in preprocessing, they are ignored in any subsequent parsing because Kramdown ignores HTML. Because of this, we have added custom handling of abbreviations and footnotes specifically for these components [1]. This custom implementation means that abbreviations defined anywhere in the document will be applied to call to action and legislative list components. Otherwise, only abbreviations defined within these components would be applied. This custom implementation has been shown to contain bugs through several Zendesk tickets. The main issues reported are: 1. Acronyms are inserted into the produced HTML, even if undesirable. For example, a link `<email@example.com>` with the acronym `*[email]: Electronic mail is a method of transmitting and receiving messages` would produce `<p>href="mailto:<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com"<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com</p>` rather than the expected link. 2. Because the `add_acronym_alt_text` method runs through the text for each acronym, acronyms can be inserted into other acronyms creating invalid HTML. If we change tack, and instead allow Kramdown to parse these components as part of the main conversion to HTML (instead of in preprocessing), Kramdown will handle abbreviations and footnotes correctly, fixing both of these issues (as well as some other undocumented issues). The call to action component requires a surrounding `div` with the `call-to-action` class. Because Kramdown syntax is normally not processed inside a HTML tag, we must toggle the `parse_block_html` option [2]. There is a small regression in that the `$CTA` block must be preceded by a blank line (demonstrated in `test/govspeak_test.rb:428`), but this seems okay as it's standard markdown. Some of the tests can likely be removed as we are now testing the functionality of the library, however, for now these represent that there are no regressions. [1]: #285 [2]: https://kramdown.gettalong.org/quickref.html#html-elements
jkempster34
added a commit
that referenced
this pull request
Oct 12, 2023
Kramdown handles abbreviations and footnotes natively. Call to action and legislative list are custom Govspeak components built as extensions to Kramdown. These sections are parsed separately in preprocessing, they are ignored in any subsequent parsing because Kramdown ignores HTML. Because of this, we have added custom handling of abbreviations and footnotes specifically for these components [1]. This custom implementation means that abbreviations defined anywhere in the document will be applied to call to action and legislative list components. Otherwise, only abbreviations defined within these components would be applied. This custom implementation has been shown to contain bugs through several Zendesk tickets. The main issues reported are: 1. Acronyms are inserted into the produced HTML, even if undesirable. For example, a link `<email@example.com>` with the acronym `*[email]: Electronic mail is a method of transmitting and receiving messages` would produce `<p>href="mailto:<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com"<abbr title="Electronic mail is a method of transmitting and receiving messages">email</abbr>@example.com</p>` rather than the expected link. 2. Because the `add_acronym_alt_text` method runs through the text for each acronym, acronyms can be inserted into other acronyms creating invalid HTML. If we change tack, and instead allow Kramdown to parse these components as part of the main conversion to HTML (instead of in preprocessing), Kramdown will handle abbreviations and footnotes correctly, fixing both of these issues (as well as some other undocumented issues). The legislative list component currently works by overriding Kramdowns `list` extension [2] and disabling ordered lists [3]. So that we can parse this component as part of the main conversion, this applies two options: `parse_block_html` [4], so that markdown inside the `legislative-list-wrapper` `div` is parsed by Kramdown, and a new custom option of `ordered_lists_disabled`, so that we can control flow inside our `Parser::Govuk` class (subclass of the Kramdown parser). It is possible to override `parse_list` instead of `parse_block_html`, but this means that the outermost list will not be parsed (by the time `parse_list` is called the outermost list is already being parsed) The only difference between this new iteration of the legislative list and the previous iteration, is that we now wrap the component in a `legislative-list-wrapper` `div`. We could remove this in postprocessing, but this requires additional modification of the Kramdown produced HTML which seems unnecessary. This also allows us to remove all remaining code which replicates the footnote and acronym functions of Kramdown. Some of the tests can likely be removed as we are now testing the functionality of the library, however, for now these represent that there are no regressions. [1]: #285 [2]: https://github.com/gettalong/kramdown/blob/bd678ecb59f70778fdb3b08bdcd39e2ab7379b45/lib/kramdown/parser/kramdown/list.rb#L54 [3]: #25 [4]: https://kramdown.gettalong.org/quickref.html#html-elements
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This adds support for acronyms in Legislative Lists and Calls to Action, which was removed in #229.
The removal was because acronyms that matched parts of the HTML tags/attributes were also being replaced with acronym markup.
Therefore resolving this by searching for acronyms before converting to HTML.
Trello card