Fix potential edge case scoring in context search #474

Merged
4 commits merged on Feb 8, 2024

Conversation

coszio
Contributor

@coszio coszio commented Jan 31, 2024

Applies the same fix that was made in core (qdrant/qdrant#3374), so that scoring stays congruent between the two implementations.

All Submissions:

  • Contributions should target the dev branch. Did you create your branch from dev?
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  1. Does your submission pass tests?
  2. Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

Changes to Core Features:

  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your core changes, as applicable?
  • Have you successfully run tests with your changes locally?

@coszio coszio requested a review from joein January 31, 2024 16:02

netlify bot commented Jan 31, 2024

Deploy Preview for poetic-froyo-8baba7 ready!

Name Link
🔨 Latest commit f30f523
🔍 Latest deploy log https://app.netlify.com/sites/poetic-froyo-8baba7/deploys/65c516e9d8d9490008482d4f
😎 Deploy Preview https://deploy-preview-474--poetic-froyo-8baba7.netlify.app

def fast_sigmoid(x: np.float32) -> np.float32:
    if np.isfinite(x):
        return x / (1.0 + abs(x))
    else:

Small thing:

You can remove the else and just return x directly, no?

Contributor Author

True, we can avoid the redundant else statement.

expit

We use the same function as the core implementation to keep scoring congruent.
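For readers following the thread, here is a minimal sketch of how the two candidates differ. It assumes "expit" refers to the logistic sigmoid (the function that `scipy.special.expit` computes); the helper name `logistic` is made up for this illustration, and `fast_sigmoid` here is a simplified version of the expression in the diff, without the non-finite checks.

```python
import math


def fast_sigmoid(x: float) -> float:
    # Expression from the diff under review: x / (1 + |x|), with range (-1, 1).
    return x / (1.0 + abs(x))


def logistic(x: float) -> float:
    # Logistic sigmoid 1 / (1 + exp(-x)), with range (0, 1).
    # Hypothetical stand-in for scipy.special.expit, written with the stdlib only.
    return 1.0 / (1.0 + math.exp(-x))


# The two functions agree on ordering but not on values or range,
# so swapping one for the other would change the resulting scores.
for value in (-2.0, 0.0, 2.0):
    print(value, fast_sigmoid(value), logistic(value))
```

Both functions are monotonic, but they produce different values, which is presumably why matching the core implementation matters for congruent scores.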

Oh I see

@@ -147,13 +147,21 @@ def calculate_distance_core(
    return calculate_distance(query, vectors, distance_type)


def fast_sigmoid(x: np.float32) -> np.float32:
    if not np.isfinite(x):
Member

I am a bit confused by this, although I know we have had such a check for quite some time now.

Are we expecting a particular non-finite value here?

Like -inf, +inf, nan? (If we expect -inf, wouldn't it be better to return a constant value?)

It seems to me that, at the moment, if a vector has a non-finite difference with any of the context pairs, it is excluded from the search results, because it will have a non-finite score in overall_scores.

Yeah, I don't know if I understand correctly, but if x equals +inf or -inf, is it fine for the function to return x, or do you want it to return, for example, 2 or 3?

Contributor Author

You're right, in those cases we should return the limit of the function, which should be -1 or +1
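A minimal sketch of what returning the limit could look like. This is an illustration of the idea discussed above, not necessarily the code that was merged; the name `fast_sigmoid_with_limits` is hypothetical.

```python
import numpy as np


def fast_sigmoid_with_limits(x: np.float32) -> np.float32:
    """Hypothetical variant of fast_sigmoid that returns the limit for +/-inf."""
    if np.isnan(x):
        # NaN has no meaningful limit, so it is passed through unchanged.
        return x
    if np.isinf(x):
        # x / (1 + |x|) tends to +1 as x -> +inf and to -1 as x -> -inf.
        return np.float32(np.sign(x))
    # Finite case: the plain expression, cast back to float32.
    return np.float32(x / (1.0 + abs(x)))


print(fast_sigmoid_with_limits(np.float32(np.inf)))
print(fast_sigmoid_with_limits(np.float32(-np.inf)))
print(fast_sigmoid_with_limits(np.float32(1.0)))
```

Handling the infinities explicitly avoids evaluating inf / inf, which would otherwise produce NaN rather than the limit.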

Comment on lines 152 to 155
    if np.isnan(x):
        # To avoid divisions on NaNs, which gets: RuntimeWarning: invalid value encountered in scalar divide
        return x  # NaN

Member

do we want to just hide the warning? :D

Contributor Author

Rust implementation also returns NaN when dividing by NaN, so I'd say it is safe to return it here too
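To illustrate the point, here is a small sketch showing that the early return gives the same result as the unguarded arithmetic in NumPy, so the guard changes only whether the division is evaluated, not the value. The function name `fast_sigmoid_nan_guard` is an assumption made for this example, and `np.errstate` is used only to keep the unguarded expression quiet.

```python
import numpy as np


def fast_sigmoid_nan_guard(x: np.float32) -> np.float32:
    """Sketch: return NaN early, otherwise compute x / (1 + |x|)."""
    if np.isnan(x):
        return x  # NaN in, NaN out, without performing the division
    return np.float32(x / (1.0 + abs(x)))


nan = np.float32(np.nan)

# Evaluate the unguarded expression with floating-point warnings silenced.
with np.errstate(invalid="ignore"):
    unguarded = nan / (np.float32(1.0) + abs(nan))

guarded = fast_sigmoid_nan_guard(nan)

# Both paths yield NaN.
print(np.isnan(unguarded), np.isnan(guarded))
```

Under this reading, the guard is not hiding a different result; it skips arithmetic whose outcome is already known.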

@joein joein self-requested a review February 8, 2024 19:59
@joein joein merged commit 69ed66e into dev Feb 8, 2024
9 of 14 checks passed
joein pushed a commit that referenced this pull request Mar 5, 2024
* apply fast_sigmoid fn to context pair score

* remove redundant else statement

* better NaN and float32 handling

* remove unused import
@generall generall deleted the fix-context-search-scoring branch May 3, 2024 10:47