Fix DataOutputAgent so that it can output items with multiple categories #2110
The `to_xml` method encodes `{ "category": ["a", "b"] }` as follows:

```xml
<item>
  <category>
    <category>a</category>
    <category>b</category>
  </category>
</item>
```

Instead of this:

```xml
<item>
  <category>a</category>
  <category>b</category>
</item>
```

It does this even though `category` is a singular noun. This behavior prevents DataOutputAgent from emitting multiple `<category>` elements (or `<enclosure>`, etc.) properly, so I've added a tweak to fix the resulting XML document.

I know the code in its current form is far from optimal, so I think we'll have to revisit this sooner or later...
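To illustrate the kind of post-processing described above, here is a minimal sketch in plain Ruby. It is not the code from this PR: `flatten_repeated` is a hypothetical helper, and it assumes the nested children contain only text (no attributes or further nesting).

```ruby
# Hypothetical helper, not the PR's actual implementation: unwrap a container
# element whose direct children all share the container's own tag name.
# Assumes children hold plain text only (no attributes, no nested markup).
def flatten_repeated(xml, tag)
  name = Regexp.escape(tag)
  item = %r{<#{name}>[^<]*</#{name}>}
  wrapper = %r{<#{name}>\s*((?:#{item}\s*)+)</#{name}>}
  # Replace the wrapper with just its inner run of same-named elements.
  xml.gsub(wrapper) { Regexp.last_match(1).strip }
end

nested = "<item><category><category>a</category><category>b</category></category></item>"
puts flatten_repeated(nested, "category")
# prints <item><category>a</category><category>b</category></item>
```

A single, non-nested `<category>a</category>` does not match the wrapper pattern, so it passes through unchanged.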
That makes total sense, I think we have had a few related issues/feature requests in the past.
Turns out one can parse XML with regular expressions 😉 The worst that could happen is that the regex is not matching and the XML is not transformed as expected, right? If you see any potential that the `gsub` chain could raise an exception it might be worth rescuing it and returning the XML generated by `to_xml`.
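The fallback suggested here could be sketched roughly as follows. `with_xml_fallback` is a hypothetical name used only for illustration, and the `gsub` call in the usage line is a stand-in for whatever transformation chain is actually applied.

```ruby
# Hypothetical wrapper: run a post-processing block on the generated XML and
# fall back to the untouched input if the block raises.
def with_xml_fallback(xml)
  yield xml
rescue StandardError
  xml
end

raw = "<item><category>a</category></item>"
puts with_xml_fallback(raw) { |x| x.gsub("category", "tag") }
# prints <item><tag>a</tag></item>
```

If the block raises, the caller still receives the original string, so a failing transformation degrades to the untransformed output rather than an error.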
Thanks for the review. These gsub calls should be safe and would never raise exceptions.
I think we should implement a new XML generator that generates a DOM tree you can manipulate before serializing to the final string form. The
That sounds like a good idea but also some work 😄 This could also allow us to add special tags to embed HTML/XML in CTAGS
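As a rough idea of what a tree-based generator might look like, here is a small sketch in plain Ruby. `Node`, `XML_ESCAPES`, and `build_nodes` are hypothetical names and are not part of Huginn or ActiveSupport; attributes, CDATA, and the XML declaration are deliberately left out.

```ruby
# Hypothetical minimal tree: each node has a name and either child nodes or text.
XML_ESCAPES = { "&" => "&amp;", "<" => "&lt;", ">" => "&gt;" }.freeze

Node = Struct.new(:name, :children, :text) do
  # Serialize this node and its subtree to a string.
  def to_xml
    body = text.nil? ? children.map(&:to_xml).join : text.gsub(/[&<>]/, XML_ESCAPES)
    "<#{name}>#{body}</#{name}>"
  end
end

# Returns an array of nodes so that an Array value expands into repeated
# sibling elements with the same name instead of a wrapper element.
def build_nodes(name, value)
  case value
  when Array then value.flat_map { |v| build_nodes(name, v) }
  when Hash  then [Node.new(name, value.flat_map { |k, v| build_nodes(k.to_s, v) }, nil)]
  else            [Node.new(name, [], value.to_s)]
  end
end

item = build_nodes("item", { "title" => "Hello", "category" => ["a", "b"] }).first
item.children << Node.new("guid", [], "123") # manipulate the tree before serializing
puts item.to_xml
# prints <item><title>Hello</title><category>a</category><category>b</category><guid>123</guid></item>
```

Because arrays are expanded while the tree is built, no string-level fix-up is needed afterwards, and the tree can still be edited before the final `to_xml` call.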
I have a question about the current implementation here. As far as I can see, arrays are only handled as described above if they are defined in the agent configuration at design time. Is this by design or am I doing something wrong?
@dsander Thanks for your reply, that's what I was looking for! Didn't know about this filter.