-
-
Notifications
You must be signed in to change notification settings - Fork 905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[bug] v1.14.0 introduces inconsistency between CRuby and JRuby #2761
Comments
Thanks for reporting, I'll take a look after the weekend. |
@dometto Thanks again for reporting. I think what's going on here is that you're toggling a bit in the SaveOptions bitmask, instead of explicitly turning it on or off, and because the underlying bits have changed, this is doing the opposite of what you want. Let me explain what's going on, step by step ... First, in JRuby Nokogiri v1.13.0: RUBY_DESCRIPTION # => "jruby 9.4.0.0 (3.1.0) 2022-11-23 95c0ec159f OpenJDK 64-Bit Server VM 11.0.17+8-post-Ubuntu-1ubuntu220.04 on 11.0.17+8-post-Ubuntu-1ubuntu220.04 [x86_64-linux]"
Nokogiri::VERSION # => "1.13.10"
Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML # => 19
Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1 # => 18 But in Nokogiri v1.14.0: RUBY_DESCRIPTION # => "jruby 9.4.0.0 (3.1.0) 2022-11-23 95c0ec159f OpenJDK 64-Bit Server VM 11.0.17+8-post-Ubuntu-1ubuntu220.04 on 11.0.17+8-post-Ubuntu-1ubuntu220.04 [x86_64-linux]"
Nokogiri::VERSION # => "1.14.0"
Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML # => 18
Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1 # => 19 The value of Removing :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1 which with Nokogiri 1.13.10 turns off the A better way to do this would be to avoid toggling the bit with the Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML & (~Nokogiri::XML::Node::SaveOptions::FORMAT) Putting it all together, here we see this is backwards-compatible with v1.13.x: Nokogiri::VERSION # => "1.13.10"
data = "<div>\n<pre><code>one\ntwo\nthree\nfour\n</code></pre>\n</div>"
doc = Nokogiri::XML::DocumentFragment.parse(data)
opts = { :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1, :indent => 0, :encoding => 'UTF-8' }
# => {:save_with=>18, :indent=>0, :encoding=>"UTF-8"}
doc.css("div").children.to_xml(opts)
# => "\n" +
# "<pre><code>one\n" +
# "two\n" +
# "three\n" +
# "four\n" +
# "</code></pre>\n"
opts = { :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML & (~Nokogiri::XML::Node::SaveOptions::FORMAT), :indent => 0, :encoding => 'UTF-8' }
# => {:save_with=>18, :indent=>0, :encoding=>"UTF-8"}
doc.css("div").children.to_xml(opts)
# => "\n" +
# "<pre><code>one\n" +
# "two\n" +
# "three\n" +
# "four\n" +
# "</code></pre>\n" and works properly in Nokogiri v1.14.x: Nokogiri::VERSION # => "1.14.0"
data = "<div>\n<pre><code>one\ntwo\nthree\nfour\n</code></pre>\n</div>"
doc = Nokogiri::XML::DocumentFragment.parse(data)
opts = { :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML ^ 1, :indent => 0, :encoding => 'UTF-8' }
# => {:save_with=>19, :indent=>0, :encoding=>"UTF-8"}
doc.css("div").children.to_xml(opts)
# => "\n" +
# "<pre>\n" +
# " <code>one\n" +
# "two\n" +
# "three\n" +
# "four\n" +
# "</code>\n" +
# "</pre>\n" +
# "\n"
opts = { :save_with => Nokogiri::XML::Node::SaveOptions::DEFAULT_XHTML & (~Nokogiri::XML::Node::SaveOptions::FORMAT), :indent => 0, :encoding => 'UTF-8' }
# => {:save_with=>18, :indent=>0, :encoding=>"UTF-8"}
doc.css("div").children.to_xml(opts)
# => "\n" +
# "<pre><code>one\n" +
# "two\n" +
# "three\n" +
# "four\n" +
# "</code></pre>\n" But really, since Nokogiri v1.14.0 is doing what you want already, you should be able to simplify to: Nokogiri::VERSION # => "1.14.0"
opts = { :indent => 0, :encoding => 'UTF-8' }
# => {:indent=>0, :encoding=>"UTF-8"}
doc.css("div").children.to_xml(opts)
# => "\n" +
# "<pre><code>one\n" +
# "two\n" +
# "three\n" +
# "four\n" +
# "</code></pre>\n" I hope all that makes sense, and I apologize for the confusion caused by changing the value of this options mask. The intention behind the change to was to have better and more consistent defaults. |
Thanks you for the very clear explanation and the help, @flavorjones! Much appreciated. ❤️ And apologies for not looking closer: I didn't notice the way I was setting the property was not robust relative to changes in the values of the constants nokogiri defines. |
5.2.4 / 2023-03-22 * Bugfix release: address XSS vulnerability ( @6661620a, @dometto) 5.2.3 / 2023-03-13 * Bugfix release: update adapter dependencies for Ruby 3.2 support. 5.2.2 / 2023-01-18 * Bugfix release: set Nokogiri default XHTML conversion options more relaibly. See sparklemotion/nokogiri#2761 5.2.1 / 2022-09-13 * Fixed: 'controls' attribute in audio and video tags should not be sanitized. #430 (@dometto) 5.2 / 2022-05-28 * Conditionally render "editable" heading classes. Resolves gollum/gollum#1785 (@benjaminwil) * Improvement: Allow escaping quotations in quoted macro arguments. #406. (@srbaker) * Improvement: [Allow for extended PlantUML types](https://github.com/gollum/gollum/wiki#plantuml-diagrams). #413. (@manofstick) * API Addition: allow defining handlers for specific languages in codeblocks. #410. (@dometto) v5.1.2 * SECURITY UPDATE: sanitize HTML generated by Macros. v5.0.1 * Bugfix release: fix emoji when using base path. Thanks to @heavywatal. v5.0 For a detailed overview of changes in 5.0 and a guide to migrating your wiki, see https://github.com/gollum/gollum/wiki/5.0-release-notes * Removed support for Web Sequence Diagroms, PlantUML now default. ** PlantUML users in 4.x please note: in this release PlantUML uses the server at https://plantuml.com by default, not `localhost`. Use the config option to keep using your own server. v4.2.1 * Performances improvements * Dependency updates ruby-gollum-lib: 5.2.4
First off, thanks for working on nokogiri! The latest release looks like it has some great improvements. Unfortunately, here's a bug. :)
Since
v1.14.0
, the following code has different results on CRuby and JRuby (where CRuby shows the correct, and previous, behaviour):Result on CRuby:
"\n<pre><code>one\ntwo\nthree\nfour\n</code></pre>\n"
Result on JRuby:
"\n<pre>\n <code>one\ntwo\nthree\nfour\n</code>\n</pre>\n\n"
That is: on JRuby, a newline and two spaces are mistakenly added inside the
<pre>
tag, despite using:indent => 0
and despite the fact that:save_with
hasFORMAT
disabled in both cases.The text was updated successfully, but these errors were encountered: