-
Notifications
You must be signed in to change notification settings - Fork 340
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
delete() is not escaping special XML characters in its query or ID parameters. #357
Comments
This is an interesting find: I think it would be more consistent to make https://github.com/django-haystack/pysolr/blob/master/pysolr.py#L1122 use ElementTree like the rest of code, or possibly to switch to using the JSON request format since that's usually faster and it should be supported by all but truly ancient versions of Solr at this point. |
I thought about that but was trying to make minimal changes. I'm happy to re-submit with either of the other approaches instead. |
I submitted [#358] which has the same functionality and tests as [#356] but uses ElementTree to generate the XML. This feels safer to me than switching to JSON, since in most cases (those with no special characters involved), solr will be receiving the exact same messages that it always has with earlier versions of pysolr. If people did prefer JSON over XML, I'd be tempted to submit an entirely separate pull request that switches all the API calls (add/update, etc) to JSON too - but think that's out of scope for this issue. Oh and we also don't pass unvalidated user input -- but we did have deletes by query on fields that had organization names in them. Annoyingly there were XML special characters in quite a few of those ("Smith & Wesson", etc.). Investigating why those failed led to noticing that delete by ID had a similar problem. |
I noticed my pull request failed the flake8 test (missing whitespace and code too long) -- I'll submit a cleaner version today. |
I merged #358 on the theory that it's consistent with what we're doing for everything else. I do like the idea of switching it to JSON but that should probably be part of a phased deprecation. Thanks for this, it's a good improvement! |
I have
Expected behaviour
The statement
should delete a document with the ID that has a string value of "cats<dogs", and the statement
should only delete a document with an ID that has the string value of
doc_4</id><id>doc_3</id><id>doc_2</id><id>doc_1
and the statementshould not delete all documents.
Actual behaviour
Steps to reproduce the behaviour
Configuration
The text was updated successfully, but these errors were encountered: