Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FastVectorHighlighter doesn't sort array types by score but Plain highlighter does #3757

Closed
nik9000 opened this issue Sep 21, 2013 · 6 comments

Comments

@nik9000
Copy link
Member

nik9000 commented Sep 21, 2013

The FastVectorHighlighter doesn't sort array types by score but Plain highlighter does: https://gist.github.com/nik9000/6645681

The plain highlighter's behaviour seems more correct.

@nik9000
Copy link
Member Author

nik9000 commented Sep 23, 2013

After poking around at this I've figured out a few things:

  1. This is a Lucene behaviour so it'd have to be fixed upstream/mirrored in Elasticsearch and submitted upstream.
  2. It all comes down to the scoring phase being done without access to the array boundaries.
  3. You can't use position_offset_gap to make a big gap because that set term position which isn't used to figure out highlighting segments. That uses offsets.

I could live with closing this issue and amending http://www.elasticsearch.org/guide/reference/api/search/highlighting/ to ward people of trying the FVH for score ordered array fields.

I imagine the FVH might be faster to highlight array's with long values or lots of values but really, I have no way to test it.

@nik9000
Copy link
Member Author

nik9000 commented Oct 15, 2013

Filed LUCENE-5285 which I'll have a look at when I get a chance.

@nik9000
Copy link
Member Author

nik9000 commented Oct 25, 2013

I've posted a solution to this over on the Lucene bug. I'd love to get this fixed so I can use #3750 on all of my fields. Well, once that is implemented.

@nik9000
Copy link
Member Author

nik9000 commented Nov 5, 2013

@jpountz, would it be possible for you to have a look at LUCENE-5285, this bug's Lucene brother? This wouldn't be a big deal but if I'm going to be able to use #3750 on all of my fields I'll need this fixed too. I've posted a patch over there which I think should do the job. This is a lot less complicated than #3750 was, I think.

@nik9000
Copy link
Member Author

nik9000 commented Dec 3, 2013

@jpountz got this in to Lucene 4.7 so when that is released and Elasticsearch updates to it this will be fixed for free. I don't feel it is worth X classing it into Elasticsearch at this point. Thanks!

@jpountz
Copy link
Contributor

jpountz commented Dec 3, 2013

Agreed. I added the "Lucene 4.7 Upgrade" label so that we don't forget to close this issue when we switch to Lucene 4.7.

s1monw added a commit that referenced this issue Feb 19, 2014
Closes #5104
Closes #5129
Closes #3757
jpountz pushed a commit to jpountz/elasticsearch that referenced this issue Feb 26, 2014
@s1monw s1monw closed this as completed in 30d7b8d Feb 26, 2014
s1monw added a commit that referenced this issue Feb 26, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants