CDATA parsing is not correct when it is not in xmlMode #119

Closed
yukinying opened this issue Feb 24, 2015 · 4 comments

Comments

@yukinying

It looks like the parser treats a CDATA block as a comment block when xmlMode is not set, via

this.oncomment("[CDATA[" + value + "]]");

(as in 1123da8).

This logic appears to be incorrect: in HTML, a "bogus comment" ends at the first > that is seen. So for a value like > <div>xxx</div>, the parser output will be wrong.
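
To make the rule concrete, here is a minimal sketch of where a bogus comment ends. `splitBogusComment` is a hypothetical helper written for illustration; it is not part of htmlparser2 and only models the "stop at the first >" rule, nothing else.

```javascript
// Hypothetical helper (not an htmlparser2 API): splits input that starts
// with "<!" at the first ">", which is where an HTML bogus comment ends.
function splitBogusComment(input) {
  if (!input.startsWith("<!")) {
    throw new Error("expected input to start with '<!'");
  }
  const end = input.indexOf(">");
  if (end === -1) {
    // No ">" at all: the comment runs to the end of the input.
    return { comment: input.slice(2), rest: "" };
  }
  // Everything between "<!" and the first ">" is comment data;
  // everything after that ">" goes back to normal tokenization.
  return { comment: input.slice(2, end), rest: input.slice(end + 1) };
}

const result = splitBogusComment("<![CDATA[ > <div>xxx</div>]]>");
console.log(JSON.stringify(result.comment)); // "[CDATA[ "
console.log(JSON.stringify(result.rest));    // " <div>xxx</div>]]>"
```

Under this rule the `<div>` ends up outside the comment, which is the difference from passing the whole value to `oncomment`.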

@fb55
Owner

fb55 commented Feb 24, 2015

In the context of a bogus comment, this is the right behavior. You can enable CDATA parsing separately; have a look at the options page in the wiki.

@fb55 fb55 closed this as completed Feb 24, 2015
@yukinying
Author

I have verified with Gumbo and Parse5 that the correct result for parsing input like <![CDATA[ > <div> xxx </div>]]> is <![CDATA[ > (bogus comment), <div> (tag), xxx (text), </div> (tag) and ]]> (text).

In htmlparser2 it is parsed as <![CDATA[ > <div> xxx </div>]]> (bogus comment).
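
The expected token sequence can be reproduced with a small sketch. `tokenize` below is a hypothetical toy tokenizer written for illustration only; it is not the Gumbo, Parse5, or htmlparser2 implementation, and it assumes simple tags whose attributes contain no `>`.

```javascript
// Hypothetical mini tokenizer (illustration only). It handles bogus
// comments, simple tags, and text, which is enough for this input.
function tokenize(html) {
  const tokens = [];
  let i = 0;
  while (i < html.length) {
    if (html.startsWith("<!", i)) {
      // Bogus comment: ends at the first ">" (or at end of input).
      const end = html.indexOf(">", i);
      const stop = end === -1 ? html.length : end + 1;
      tokens.push({ type: "comment", raw: html.slice(i, stop) });
      i = stop;
    } else if (/^<\/?[a-zA-Z]/.test(html.slice(i, i + 3))) {
      // Start or end tag: also ends at the next ">".
      const end = html.indexOf(">", i);
      const stop = end === -1 ? html.length : end + 1;
      tokens.push({ type: "tag", raw: html.slice(i, stop) });
      i = stop;
    } else {
      // Text: runs until the next "<" or the end of input.
      const next = html.indexOf("<", i + 1);
      const stop = next === -1 ? html.length : next;
      tokens.push({ type: "text", raw: html.slice(i, stop) });
      i = stop;
    }
  }
  return tokens;
}

for (const t of tokenize("<![CDATA[ > <div> xxx </div>]]>")) {
  console.log(t.type, JSON.stringify(t.raw));
}
```

The sketch yields the comment, the two tags, and the trailing `]]>` as text, plus the whitespace around them as text tokens.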

@fb55
Owner

fb55 commented Feb 25, 2015 via email

@yukinying
Author

Thanks!
