Speedup ci #489

Merged 57 commits on Mar 6, 2025

Conversation

hh-space-invader
Contributor

All Submissions:

  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same update/change?

New Feature Submissions:

  • Does your submission pass the existing tests?
  • Have you added tests for your feature?
  • Have you installed pre-commit with pip3 install pre-commit and set up hooks with pre-commit install?

New models submission:

  • Have you added an explanation of why it's important to include this model?
  • Have you added tests for the new model? Were canonical values for tests computed via the original model?
  • Have you added the code snippet for how canonical values were computed?
  • Have you successfully run tests with your changes locally?


coderabbitai bot commented Feb 27, 2025

📝 Walkthrough

The changes update both the CI configuration and the test suites. In the workflow file, the event triggers have been adjusted by removing the automatic trigger on pushes and shifting the execution to pull requests targeting specific branches, alongside adding an option for manual execution. Across the various test files, several tests now incorporate dynamic behavior based on the execution context (manual trigger or CI) by introducing environment checks. The testing logic has been streamlined by replacing internal loops with parameterized tests for functions dealing with image, text, and cross embeddings. Modifications include updates to method signatures, conditional selection of models based on their presence in specific lists, and refined assertions to verify embedding outputs. Additionally, minor corrections were made to variable names. A new utility function, should_test_model, has been introduced to assist in determining model eligibility for testing. These changes enhance flexibility and clarity in test execution across different scenarios without altering the core functionality.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 3826294 and bf9bb25.

📒 Files selected for processing (8)
  • tests/test_image_onnx_embeddings.py (3 hunks)
  • tests/test_late_interaction_embeddings.py (3 hunks)
  • tests/test_late_interaction_multimodal.py (2 hunks)
  • tests/test_sparse_embeddings.py (4 hunks)
  • tests/test_text_cross_encoder.py (4 hunks)
  • tests/test_text_multitask_embeddings.py (7 hunks)
  • tests/test_text_onnx_embeddings.py (5 hunks)
  • tests/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/test_text_cross_encoder.py
  • tests/test_text_onnx_embeddings.py
  • tests/test_sparse_embeddings.py
  • tests/test_text_multitask_embeddings.py
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (12)
tests/utils.py (1)

42-67: Well-structured utility function for conditional model testing.

The should_test_model function provides a clear, centralized way to determine which models should be tested in different environments. The implementation logic correctly handles the three testing scenarios described in the docstring.

However, consider extracting the hardcoded threshold value 1 into a named constant:

+# Maximum model size in GB to test in non-CI environments
+MAX_LOCAL_MODEL_SIZE_GB = 1
+
 def should_test_model(
     model_desc: BaseModelDescription,
     autotest_model_name: str,
     is_ci: Optional[str],
     is_manual: bool,
 ):
     """Determine if a model should be tested based on environment
     ...
     """
     if not is_ci:
-        if model_desc.size_in_GB > 1:
+        if model_desc.size_in_GB > MAX_LOCAL_MODEL_SIZE_GB:
             return False
     elif not is_manual and model_desc.model != autotest_model_name:
         return False
     return True
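
For illustration, a minimal sketch of how such a helper might be wired into a parameterized test; the parameterized model name and the use of TextEmbedding._list_supported_models() mirror other parts of this review and are assumptions, not the exact code in the PR:

import os

import pytest

from fastembed import TextEmbedding
from tests.utils import should_test_model  # helper introduced in this PR


@pytest.mark.parametrize("model_name", ["BAAI/bge-small-en-v1.5"])
def test_embedding(model_name: str) -> None:
    is_ci = os.getenv("CI")
    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
    for model_desc in TextEmbedding._list_supported_models():
        if not should_test_model(model_desc, model_name, is_ci, is_manual):
            continue  # large models skipped locally; only the default model runs in plain CI
        ...  # embed sample documents with model_desc.model and compare to canonical values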
tests/test_image_onnx_embeddings.py (3)

30-37: Good use of parameterization and conditional testing logic.

The test function now correctly uses pytest parameterization and the new should_test_model utility to selectively test models based on environment context. This approach makes the tests more efficient, especially in CI environments.


59-60: Added assertion verifies consistent embedding outputs.

This assertion ensures that embeddings produced from different image formats (file path vs opened image) are consistent, which is a valuable addition to the test coverage.


79-84: Added canonical vector verification in batch embedding test.

The additional assertions provide stronger validation by ensuring that batch embedding results match the expected canonical vectors, strengthening the test coverage.
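
As a rough sketch of the canonical-value assertion being described (the tolerance and slicing are illustrative assumptions):

import numpy as np


def assert_matches_canonical(embeddings: np.ndarray, canonical: np.ndarray, atol: float = 1e-3) -> None:
    # compare the first produced embedding against the stored reference slice
    assert np.allclose(embeddings[0, : canonical.shape[0]], canonical, atol=atol)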

tests/test_late_interaction_multimodal.py (3)

49-50: Improved test skip logic using pytest.skip().

Using pytest.skip() at the beginning of the test function improves readability by separating the skip condition from the test logic itself. This makes the code cleaner and avoids unnecessary nesting.
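
A minimal sketch of the skip-early pattern (the exact skip condition here is an assumption for illustration):

import os

import pytest


def test_multimodal_embedding() -> None:
    # decide up front whether this environment should run the heavy test at all
    if os.getenv("CI") and os.getenv("GITHUB_EVENT_NAME") != "workflow_dispatch":
        pytest.skip("heavy multimodal models are only exercised on manual workflow runs")
    ...  # actual embedding checks follow, free of extra nesting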


63-64: Consistent application of test skip pattern.

The same clean pattern is applied across all test functions, maintaining consistency in how tests are conditionally executed based on environment.


75-76: Consistent application of test skip pattern.

The same pattern is applied here as well, completing the refactoring across all functions in this file.

tests/test_late_interaction_embeddings.py (5)

156-170: Parameterized test with fixed model for reliable CI testing.

The test is now parameterized with a specific small model (answerdotai/answerai-colbert-small-v1) rather than dynamically selecting models. This ensures consistent and reliable testing in CI environments.

This change addresses previous concerns about potential test failures when models were selected dynamically.


173-191: Improved model selection with environment-aware filtering.

The test now properly uses the should_test_model utility to selectively test models based on environment context and manual trigger status. This avoids potential StopIteration errors that existed in the previous implementation and ensures appropriate model selection.


194-212: Fixed canonical vector comparison for query embeddings.

The test now correctly uses CANONICAL_QUERY_VALUES for validation of query embeddings, fixing a previous issue where it may have used the wrong canonical values array.


215-235: Parameterized parallel processing test with explicit model specification.

The parallel processing test is now parameterized with both token dimension and model name, which makes the test more explicit and maintainable. The hardcoded model name ensures consistent testing results.
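
For reference, a sketch of parameterizing the token dimension and model name together; the concrete values and the parallel settings are assumptions based on this review:

import pytest

from fastembed import LateInteractionTextEmbedding


@pytest.mark.parametrize("token_dim,model_name", [(96, "answerdotai/answerai-colbert-small-v1")])
def test_parallel_processing(token_dim: int, model_name: str) -> None:
    model = LateInteractionTextEmbedding(model_name=model_name)
    docs = ["hello world", "flag embedding"] * 50
    embeddings = list(model.embed(docs, batch_size=10, parallel=2))
    # every per-token embedding matrix should have the expected token dimension
    assert all(e.shape[-1] == token_dim for e in embeddings)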


238-256: Updated lazy_load test to use smaller model.

The test_lazy_load function now uses the smaller answerdotai/answerai-colbert-small-v1 model instead of the previously used larger model, which helps with test performance and consistency, especially in CI environments.



coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/python-tests.yml (2)

7-7: Fix trailing whitespace issues.

There are trailing whitespace characters on lines 7 and 46 that should be removed.

-  
+
-poetry run pytest 
+poetry run pytest

Also applies to: 46-46

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 7-7: trailing spaces

(trailing-spaces)


46-46: Add missing newline at end of file.

According to best practices and YAML linting rules, files should end with a newline character.

-poetry run pytest 
+poetry run pytest
+
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 46-46: no new line character at the end of file

(new-line-at-end-of-file)


[error] 46-46: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6cda2ce and 640f106.

📒 Files selected for processing (7)
  • .github/workflows/python-tests.yml (2 hunks)
  • tests/test_image_onnx_embeddings.py (2 hunks)
  • tests/test_late_interaction_embeddings.py (2 hunks)
  • tests/test_sparse_embeddings.py (1 hunks)
  • tests/test_text_cross_encoder.py (2 hunks)
  • tests/test_text_multitask_embeddings.py (4 hunks)
  • tests/test_text_onnx_embeddings.py (4 hunks)
🧰 Additional context used
🪛 YAMLlint (1.35.1)
.github/workflows/python-tests.yml

[error] 7-7: trailing spaces

(trailing-spaces)


[error] 46-46: no new line character at the end of file

(new-line-at-end-of-file)


[error] 46-46: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (27)
tests/test_text_onnx_embeddings.py (4)

78-78: Good addition to optimize test execution based on trigger context.

The is_manual variable provides a way to differentiate between automated CI runs and manual executions, which allows for different testing strategies based on the context.


80-90: Excellent optimization for CI execution.

This change intelligently selects which models to test based on the execution context:

  • During regular CI runs: only tests the first model (minimal testing)
  • During manual workflow runs: tests all supported models (comprehensive testing)
  • Properly skips models not present in the canonical values

This approach significantly speeds up CI execution while still allowing thorough testing when needed.
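
A condensed sketch of the context-aware selection described above; the function name and parameters are hypothetical, while the CI/GITHUB_EVENT_NAME checks and the 1 GB cutoff follow this review:

import os
from typing import Any


def select_models_to_test(all_models: list[Any], canonical: dict[str, Any], default_model: str) -> list[Any]:
    """Pick which model descriptions to test based on the execution context."""
    is_ci = os.getenv("CI")  # set on GitHub-hosted runners
    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"  # manual trigger

    if is_manual:
        # manual runs: every model that has recorded canonical values
        return [m for m in all_models if m.model in canonical]
    if is_ci:
        # regular CI runs: only the single parameterized/default model
        return [m for m in all_models if m.model == default_model]
    # local runs: skip models larger than 1 GB to avoid long downloads
    return [m for m in all_models if m.size_in_GB <= 1 and m.model in canonical]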


104-104: Format string access correction.

Changed from dictionary-style access model_desc["model"] to attribute access model_desc.model, which is more appropriate for the object type.


109-109: Good test parameterization and dynamic assertion improvements.

The changes include:

  1. Simplified test parameterization to use a single model
  2. Updated assertions to use dynamic len(docs) instead of hardcoded values

These improvements make the tests more maintainable and less prone to breaking when input data changes.

Also applies to: 118-118, 125-126, 141-141
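
A small example of the dynamic-length assertion pattern endorsed here (the documents and dimension are placeholders):

import numpy as np

docs = ["hello world", "flag embedding"]
embeddings = np.zeros((len(docs), 384))  # stand-in for np.stack(list(model.embed(docs)))

# the expected row count follows the input, not a hardcoded number
assert embeddings.shape == (len(docs), 384)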

.github/workflows/python-tests.yml (1)

3-7: Enhanced workflow trigger configuration.

The changes improve the workflow trigger configuration by:

  1. Removing auto-trigger on push events
  2. Specifying target branches for pull_request events
  3. Adding workflow_dispatch for manual triggering

This aligns with the test optimizations and allows for manual comprehensive testing.

🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 7-7: trailing spaces

(trailing-spaces)

tests/test_image_onnx_embeddings.py (2)

32-42: Well-structured optimization for image embedding tests.

This change follows the same pattern as the text embedding tests:

  • Uses is_manual to detect workflow trigger context
  • Tests only the first model in regular CI runs
  • Tests all supported models in manual runs
  • Properly skips models based on size and availability in canonical values

This provides a good balance between CI speed and comprehensive testing.


84-89: Enhanced test validation for batch embedding.

The additional validations improve test coverage by:

  1. Verifying that embeddings for similar inputs are close to each other
  2. Checking that the first embedding matches the expected canonical values
  3. Using a dynamic shape assertion that adapts to input size

These checks strengthen the test quality while maintaining compatibility with the optimization changes.

tests/test_sparse_embeddings.py (2)

69-78: Consistent optimization pattern for sparse embedding tests.

The sparse embedding tests have been updated with the same optimization pattern:

  • Adding is_manual check for workflow context detection
  • Limiting tests to the first model in regular CI runs
  • Testing all models in manual runs
  • Skipping models based on size and availability in canonical values

This consistent approach across test files improves maintainability and CI performance.


83-83: Dynamic expected result retrieval.

Now fetches the expected test result dynamically based on the current model being tested, making the test more adaptable when iterating through multiple models.
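
In sketch form, the dynamic lookup the comment refers to; the dictionary name follows the review text, and the numbers are purely illustrative:

# maps model name -> expected sparse output for a fixed input document
CANONICAL_COLUMN_VALUES = {
    "prithivida/Splade_PP_en_v1": {"indices": [2040, 2047], "values": [1.25, 0.81]},  # illustrative values
}


def expected_for(model_name: str) -> dict:
    # look up the reference result for whichever model is currently under test
    return CANONICAL_COLUMN_VALUES[model_name]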

tests/test_text_cross_encoder.py (5)

21-21: Smart optimization for test execution context.

Adding this check for manual workflow dispatch allows the tests to behave differently based on the execution context. This is important for CI speedup.


23-24: Good approach for flexible model testing.

Using _list_supported_models() to dynamically determine available models and then selecting a subset based on execution context is an excellent strategy. This allows comprehensive testing during manual runs while keeping CI runs efficient.


26-48: Well-structured test iteration logic.

The loop-based approach with conditional filtering provides a good balance between test coverage and performance. The conditions for skipping larger models or those without canonical values enhance CI efficiency while maintaining test reliability.


54-54: Focused test parameterization for CI performance.

Reducing the parameterization to a single model instead of multiple models will significantly speed up CI runs while still maintaining essential test coverage with a representative model.


100-100: Consistent simplification of test parameterization.

This change aligns with the previous parameterization updates, maintaining consistency throughout the test suite and contributing to the overall CI speedup goal.

tests/test_text_multitask_embeddings.py (6)

63-66: Effective parameterization for test consistency.

Using pytest's parameterization decorator with specific test values reduces test execution time and increases maintainability. This approach allows for more controlled test scenarios with explicit parameters.


67-87: Streamlined test logic with parametrized input.

The function now accepts explicit parameters rather than relying on internal logic to determine what to test. This improves readability and makes the test behavior more predictable and faster.


92-96: Consistent context-aware testing strategy.

The manual execution check is aligned with the project-wide approach of optimizing test execution based on context. The selective model testing significantly reduces CI run time.


127-132: Applied context-awareness pattern consistently.

This change follows the same pattern used in other test functions, ensuring consistency throughout the test suite. The conditional model selection helps avoid testing large models in CI environments.


162-172: Maintained testing pattern consistency.

The same context-aware testing pattern is applied here, maintaining a consistent approach across all test functions. This consistency makes the code more maintainable and predictable.


195-199: Effective parameterization for performance testing.

The parameterization of this test follows the same pattern established earlier, providing consistent test behavior with better control over what gets tested in different environments.

tests/test_late_interaction_embeddings.py (7)

156-157: Simplified test parameterization for better CI performance.

Using a single model instead of iterating through multiple models will significantly reduce test execution time while still validating core functionality.


161-170: Streamlined test implementation with focused assertions.

The simplified implementation with direct model testing eliminates unnecessary iterations while maintaining the original test objectives. The code remains clear and focused.


175-180: Consistent application of context-aware testing pattern.

This change aligns with the pattern used in other test files, creating a unified approach to test selection based on execution context. This consistency is important for maintainability.


200-211: Maintained context-aware testing strategy.

Implementing the same selective testing approach ensures consistent behavior across test functions and files, contributing to the overall CI optimization goals.


215-215: Small inconsistency in expected result variable.

The test is using CANONICAL_COLUMN_VALUES instead of CANONICAL_QUERY_VALUES even though this is a query embedding test.

It seems this test is intended to verify a query embedding against column values. Please verify this is the intended behavior, as typically query embeddings would be verified against query-specific canonical values.


223-226: Effective parameterization for performance testing.

Using explicit parameterization with specific values makes the test more predictable and faster to execute, which is aligned with the CI optimization goals.


251-251: Updated model selection for consistency.

Changing the model to answerdotai/answerai-colbert-small-v1 aligns this test with the models used in other tests, providing consistency across the test suite.

coderabbitai bot left a comment

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 640f106 and bec1161.

📒 Files selected for processing (6)
  • tests/test_image_onnx_embeddings.py (2 hunks)
  • tests/test_late_interaction_embeddings.py (2 hunks)
  • tests/test_sparse_embeddings.py (1 hunks)
  • tests/test_text_cross_encoder.py (2 hunks)
  • tests/test_text_multitask_embeddings.py (4 hunks)
  • tests/test_text_onnx_embeddings.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
  • tests/test_sparse_embeddings.py
  • tests/test_image_onnx_embeddings.py
  • tests/test_text_onnx_embeddings.py
  • tests/test_text_multitask_embeddings.py
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (8)
tests/test_text_cross_encoder.py (4)

21-28: Smart optimization of test model selection based on execution context

This change appropriately adapts the testing scope based on whether tests are running in CI or were manually triggered. Using environment variables to control test behavior helps reduce CI run time while still allowing comprehensive testing when needed.


30-53: Well-structured model testing loop with appropriate filtering conditions

The loop intelligently filters models based on size and presence in canonical values, which prevents unnecessary testing of large models in non-CI environments. The testing logic inside the loop is comprehensive and maintains the same level of validation as before.


58-58: Simplified test parameterization to improve CI speed

Focusing on a single representative model for batch reranking tests instead of testing all models is a good optimization that will help speed up CI without significantly reducing test coverage.


104-104: Consistent parameterization approach for parallel testing

The same parameterization approach is applied here, maintaining consistency across the test suite while optimizing test execution time.

tests/test_late_interaction_embeddings.py (4)

156-170: Improved test structure with parameterization

Converting the batch embedding test to use parameterization improves code organization and makes the test more maintainable. The focused testing on a single model helps to speed up CI while still verifying critical functionality.


175-199: Smart conditional testing strategy for single embeddings

This implementation follows the same pattern as in the text cross encoder tests, applying conditional execution based on test context. The approach helps optimize test execution while still enabling thorough testing when needed.


231-238: Enhanced test flexibility with multi-parameter parameterization

Parameterizing both the token dimension and model name makes the test more configurable and easier to maintain. This approach allows for greater flexibility in future updates.


259-259: Consistent model selection across test suite

Updating the model used for lazy loading tests maintains consistency with other tests in the file, which is a good practice for test maintainability.

coderabbitai bot left a comment

Actionable comments posted: 2

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between bec1161 and 6f35475.

📒 Files selected for processing (1)
  • tests/test_late_interaction_embeddings.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (3)
tests/test_late_interaction_embeddings.py (3)

156-171: Nice parameterization of the test_batch_embedding function!

The change to a parameterized test makes the code more maintainable, aligning with the overall PR objective of speeding up CI by focusing on a specific model.


231-235: Good parameterization of test_parallel_processing

The parameterization with both token_dim and model_name makes the test more flexible and explicit about its requirements.


259-260: Consistent model selection across tests

Good change to use the same model as other tests, promoting consistency across the test suite.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_text_cross_encoder.py (1)

24-26: Add a default value to the next function to handle potential exceptions

Using next() without a default value could raise a StopIteration exception if no model matches the condition.

-        [next(model for model in all_models if model.model in CANONICAL_SCORE_VALUES)]
+        [next((model for model in all_models if model.model in CANONICAL_SCORE_VALUES), all_models[0])]
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6f35475 and 4d3b2d9.

📒 Files selected for processing (1)
  • tests/test_text_cross_encoder.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (4)
tests/test_text_cross_encoder.py (4)

19-28: Smart model selection based on execution context

The refactoring of test_rerank to dynamically select models based on execution context is a good approach for speeding up CI. The function now intelligently chooses between testing a single model (for CI) or all models (for manual runs).


30-52: Effective model filtering and testing strategy

The loop that tests each model implements good filtering logic based on model size and availability of canonical scores. The consistency checks between different methods and against canonical values ensure the reliability of the tests.


55-58: Simplified test parametrization for CI speed improvement

Reducing the test matrix to a single model for test_batch_rerank is an effective way to speed up CI while maintaining essential test coverage.


101-104: Consistent parameterization approach across test functions

The parameterization of test_rerank_pairs_parallel is consistent with other test functions, which is good for maintainability.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
tests/test_image_onnx_embeddings.py (1)

51-55: Consider adding a comment explaining the conditional logic

The conditional logic is sophisticated but could benefit from a comment explaining why large models are skipped in non-CI environments and why models not in CANONICAL_VECTOR_VALUES are skipped.

+ # Skip large models in non-CI environments to speed up local testing
+ # Skip models without canonical vectors for verification
  for model_desc in models_to_test:
      if (
          not is_ci and model_desc.size_in_GB > 1
      ) or model_desc.model not in CANONICAL_VECTOR_VALUES:
          continue
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between da8f392 and 671b874.

📒 Files selected for processing (1)
  • tests/test_image_onnx_embeddings.py (2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (6)
tests/test_image_onnx_embeddings.py (6)

29-31: Nice optimization to identify the smallest model for tests

This change helps speed up CI by selecting the smallest model based on file size, which is a good strategy for reducing test execution time.
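
A sketch of selecting the lightest model by declared size, assuming ImageEmbedding exposes the same _list_supported_models() helper referenced elsewhere in this review:

from fastembed import ImageEmbedding

# pick the lightest supported model so downloads stay small in CI and local runs
supported = ImageEmbedding._list_supported_models()
smallest = min(supported, key=lambda m: m.size_in_GB)
print(smallest.model, smallest.size_in_GB)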


33-40: Good use of parameterization with fallback mechanism

The parameterization approach with a fallback to a known working model is well-designed. It ensures the test will run even if the smallest model isn't part of the canonical set.


41-41: Appropriate method signature update for parameterization

The method signature has been correctly updated to accommodate the new parameterized approach.


43-49: Good conditional test execution based on trigger type

Using the workflow trigger type to determine test scope is a smart approach. This ensures comprehensive testing during manual runs while keeping CI runs efficient.


97-98: Good additional validation for consistency

This assertion ensures that different input formats (file path vs. PIL Image) produce consistent embeddings, strengthening the test.


102-102: Important enhancement to validate against canonical vectors

This addition ensures that the embeddings match the expected canonical values, which is crucial for maintaining compatibility across changes.

coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (13)
tests/test_image_onnx_embeddings.py (3)

29-29: Consider clarifying or refactoring the global constant.
ALL_IMAGE_MODEL_DESC is assigned from a class method. If this variable needs to be used elsewhere, consider centralizing it in a dedicated config or helper file to maintain consistency across tests.


32-41: Consolidate environment-driven logic.
Using is_manual to conditionally select between all models and a single one is clear. However, you may want a shared helper method (e.g., get_models_to_test(is_manual, ALL_MODELS, default_model_name)) to reduce duplication in other tests, improving maintainability.


89-94: Strengthen checks on array closeness.
Verifying that embeddings[1] and embeddings[2] are close is good, but consider confirming that these inputs are actually expected to produce (near-)identical embeddings. If they are meant to be the same image supplied in different formats, a short code comment clarifying this intention would help future maintainers.

tests/test_sparse_embeddings.py (3)

49-49: Centralize model listing logic.
ALL_SPARSE_MODEL_DESC mirrors similar variables in other files. Consider placing these lists in a common utility or config if many test suites share the pattern, improving consistency.


69-78: Smooth environment-based parameterization.
The logic for switching between a single model_name and multiple models for is_manual is a neat approach. You could unify it by introducing a helper function that returns a list of testable models based on environment variables, to avoid duplication across files.


89-89: Recommend clarifying usage of expected values.
The code fetches expected_result for each model from a dictionary. Consider referencing or documenting how these canonical values were generated. This detail can be helpful for debugging if test outputs deviate from expected.

tests/test_text_onnx_embeddings.py (3)

74-74: Potentially unify descriptors across embedding types.
ALL_TEXT_MODEL_DESC is yet another descriptor list pulled from _list_supported_models(). If you expect multiple embedding categories to share or overlap in tested models, a global registry or a single source of truth might reduce duplication.


114-114: Simplify multi-dimension test list if needed.
Currently, you're only testing (384, "BAAI/bge-small-en-v1.5"). If you plan to handle multiple dimension-model pairs again, parameterizing them in a single place (e.g., a global dictionary) may be beneficial.


146-146: Encourage verifying concurrency.
The shape check confirms correct dimensions. Consider a direct correctness check of the parallel embeddings to ensure they are identical to the sequential run (you do this with np.allclose, which is good). Possibly add a comment clarifying concurrency reliability.
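
A brief sketch of the parallel-versus-sequential equivalence check suggested here; the model choice, batch size, and tolerance are assumptions:

import numpy as np

from fastembed import TextEmbedding

model = TextEmbedding(model_name="BAAI/bge-small-en-v1.5")
docs = ["hello world", "flag embedding"] * 50

sequential = np.stack(list(model.embed(docs)))
parallel = np.stack(list(model.embed(docs, batch_size=10, parallel=2)))

# parallel workers should reproduce the sequential result within float tolerance
assert np.allclose(sequential, parallel, atol=1e-4)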

tests/test_text_cross_encoder.py (4)

18-18: Keep the all-model descriptor in sync.
ALL_RERANK_MODEL_DESC follows the same pattern across test suites. If additional test code references these descriptors, you might unify them in a single library for easy maintenance.


32-35: Large-model skip thresholds.
Like in the other test files, consider factoring out the numeric threshold for better maintainability and to avoid drift across different test suites.


39-47: Enhance test coverage with additional queries.
Currently, the test uses only one query: "What is the capital of France?". Adding more varied queries (multiple contexts, different complexities) might catch edge cases in the ranking logic.


49-52: Validation against canonical scores is a strong check.
Comparing the reranker output with known reference values is a reliable approach. Just remember to periodically refresh or cross-check these expected values if upstream models are updated.
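
A hedged sketch of that canonical-score comparison; the import path and rerank signature follow the fastembed cross-encoder API as used in these tests, and the reference scores are illustrative only:

import numpy as np

from fastembed.rerank.cross_encoder import TextCrossEncoder

CANONICAL_SCORE_VALUES = {"Xenova/ms-marco-MiniLM-L-6-v2": [8.5, -2.6]}  # illustrative reference scores

model_name = "Xenova/ms-marco-MiniLM-L-6-v2"
encoder = TextCrossEncoder(model_name=model_name)
scores = list(
    encoder.rerank(
        "What is the capital of France?",
        ["Paris is the capital of France.", "Berlin is a city in Germany."],
    )
)

# drift in upstream weights shows up here, so refresh canonical values when the model changes
assert np.allclose(scores, CANONICAL_SCORE_VALUES[model_name], atol=1e-2)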

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 671b874 and 44644e7.

📒 Files selected for processing (6)
  • tests/test_image_onnx_embeddings.py (2 hunks)
  • tests/test_late_interaction_embeddings.py (2 hunks)
  • tests/test_sparse_embeddings.py (3 hunks)
  • tests/test_text_cross_encoder.py (3 hunks)
  • tests/test_text_multitask_embeddings.py (5 hunks)
  • tests/test_text_onnx_embeddings.py (4 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (21)
tests/test_image_onnx_embeddings.py (1)

75-76: Single-model parameter test is good for quick validation.
Testing one model with specific dimensions accelerates feedback in CI. Pair it with the environment-driven testing approach if you plan to cover multiple models in manual mode.

tests/test_sparse_embeddings.py (2)

80-83: Flexible skip condition for large models.
Like in other tests, you skip large models for local runs. Defining a single constant or utility method for this threshold could prevent duplication and potential misalignment.


191-191: Good addition for testing lazy loading.
Parameterizing the test_lazy_load function with "prithivida/Splade_PP_en_v1" ensures coverage for a single known-good model. This is consistent with your other environment-based tests.

tests/test_text_onnx_embeddings.py (2)

77-87: Introducing parameterization for fine-grained model testing.
Switching to a single "BAAI/bge-small-en-v1.5" model by default is aligned with your environment-based approach. This pattern improves test clarity, but watch out for code duplication across test files (e.g., test_image_onnx_embeddings.py).


123-123: Good use of dynamic shape assertions.
Replacing a fixed numerical shape with len(docs) ensures the test scales with different input sizes. This is a solid improvement for test flexibility.

tests/test_text_cross_encoder.py (2)

21-30: Context-aware model selection is consistent.
Parametrizing model_name and conditionally switching to multiple models is a helpful approach, matching the other tests’ patterns. No immediate issues, just ensure it’s documented in CONTRIBUTING.md or similar if you want new tests to follow suit.


57-57: Good practice to batch test.
Batch-oriented tests for the reranker help ensure consistent performance and correctness at scale. The structure here is straightforward and in line with other parallel tests.

tests/test_late_interaction_embeddings.py (7)

156-157: Good use of parameterization for CI optimization.

The test is now parameterized to use a specific, smaller model instead of iterating through all models. This will help speed up CI runs while still providing adequate test coverage.


173-184: Improved model selection logic that addresses previous issues.

The new implementation properly handles model selection based on execution context:

  • Uses only one specific model in CI
  • Uses all models in manual runs
  • Fixed previous issue with potential StopIteration error by using list comprehension

186-192: Good size-based filtering for model selection.

The code now properly skips large models (>1GB) in non-CI environments and only tests models that have canonical values, which helps prevent test failures while maintaining good coverage.


203-214: Consistent implementation with test_single_embedding.

This implementation follows the same pattern as the single_embedding test, with proper model selection logic based on execution context.


216-221: Well-structured conditional testing based on model properties.

The code correctly checks both size constraints and canonical value availability before attempting to run tests, which helps prevent errors during test execution.


233-236: Good parameterization of token_dim and model_name.

Using parameterization for both the token dimension and model name provides clear test configuration and makes it easier to maintain or extend in the future.


256-257: Consistent model selection across tests.

Using the smaller model "answerdotai/answerai-colbert-small-v1" for all tests ensures consistency and faster CI execution.

tests/test_text_multitask_embeddings.py (7)

63-64: Well-structured parameterization for dimension and model.

Using explicit parameterization for both dimension and model name makes the test more maintainable and transparent about what's being tested.


85-95: Effective model selection logic based on execution context.

Similar to the late interaction tests, this implementation properly selects models based on whether the test is running in CI or manually, which helps optimize test execution time.


97-101: Good filtering logic for model selection.

The code appropriately filters out large models in non-CI environments and ensures models have canonical values before testing, preventing potential test failures.


125-136: Consistent implementation across test functions.

The query embedding test follows the same pattern as other tests in this file, maintaining consistency in how models are selected and tests are parameterized.


165-176: Parallel structure with other test functions.

This test maintains the same pattern as the other embedding tests, showing good consistency in the implementation approach.


205-206: Well-parameterized test for parallel processing.

Using explicit dimension and model name parameters makes the test configuration clear and maintainable.


251-252: Consistent approach to lazy_load testing.

The lazy_load test now follows the same parameterization pattern as other tests, maintaining consistency across the test suite.

Comment on lines 43 to 46
for model_desc in models_to_test:
    if (
        not is_ci and model_desc.size_in_GB > 1
    ) or model_desc.model not in CANONICAL_VECTOR_VALUES:

💡 Verification agent

🧩 Analysis chain

Validate large-model skipping threshold.
The code skips large models for local runs if they exceed 1GB. Consider making 1GB a named constant (e.g., MAX_LOCAL_MODEL_SIZE) for better clarity and easier tuning in the future.

Would you like me to provide a quick script searching for other numeric size constraints in the codebase?


🏁 Script executed:

#!/bin/bash
# Searching for numeric thresholds used in skip conditions
rg 'size_in_GB\s*[><=]=?\s*\d+'

Length of output: 4174


Action Required: Extract 1GB Threshold into a Named Constant

The code currently hardcodes the value for skipping tests based on model size (using > 1 as the threshold). Our search confirms this literal is used across various tests (e.g., in tests/test_image_onnx_embeddings.py, tests/test_late_interaction_embeddings.py, and others). For clarity and maintainability, please extract the 1 (representing 1GB) into a named constant (e.g., MAX_LOCAL_MODEL_SIZE) so that any future adjustments can be made in one place.

  • Affected files include:
    • tests/test_image_onnx_embeddings.py
    • tests/test_late_interaction_embeddings.py
    • tests/test_sparse_embeddings.py
    • tests/test_text_multitask_embeddings.py
    • tests/test_text_onnx_embeddings.py
    • tests/test_text_cross_encoder.py

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (2)
.github/workflows/python-tests.yml (2)

7-7: Fix trailing whitespace.

There are trailing spaces at the end of this line.

-  workflow_dispatch:
-  
+  workflow_dispatch:
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 7-7: trailing spaces

(trailing-spaces)


46-46: Fix missing newline at end of file and trailing whitespace.

The file should end with a newline character and the trailing spaces should be removed.

-          poetry run pytest 
+          poetry run pytest
🧰 Tools
🪛 YAMLlint (1.35.1)

[error] 46-46: no new line character at the end of file

(new-line-at-end-of-file)


[error] 46-46: trailing spaces

(trailing-spaces)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 44644e7 and c3a03d3.

📒 Files selected for processing (7)
  • .github/workflows/python-tests.yml (2 hunks)
  • tests/test_image_onnx_embeddings.py (2 hunks)
  • tests/test_late_interaction_embeddings.py (2 hunks)
  • tests/test_sparse_embeddings.py (3 hunks)
  • tests/test_text_cross_encoder.py (3 hunks)
  • tests/test_text_multitask_embeddings.py (5 hunks)
  • tests/test_text_onnx_embeddings.py (4 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_text_multitask_embeddings.py
🧰 Additional context used
🪛 YAMLlint (1.35.1)
.github/workflows/python-tests.yml

[error] 7-7: trailing spaces

(trailing-spaces)


[error] 46-46: no new line character at the end of file

(new-line-at-end-of-file)


[error] 46-46: trailing spaces

(trailing-spaces)

⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (21)
.github/workflows/python-tests.yml (1)

3-6: Improved CI efficiency by optimizing workflow triggers.

The workflow has been modified to:

  1. Run only on pull requests to specified branches
  2. Allow manual triggering via workflow_dispatch

This change will reduce unnecessary CI runs and provide flexibility for manual testing.

tests/test_image_onnx_embeddings.py (3)

30-31: Enhanced test flexibility with parameterization.

The test has been updated to use parameterization, allowing for selective testing of specific models.


33-45: Added conditional logic to optimize test execution.

This change intelligently determines which models to test based on the execution context:

  • For manual runs via workflow_dispatch, it tests all models in CANONICAL_VECTOR_VALUES
  • For regular CI runs, it only tests the specifically parameterized model

This significantly speeds up CI by reducing the number of models tested during regular runs.


86-91: Added improved assertions to verify embedding quality.

Added assertions to ensure:

  1. Embeddings are consistent across different input formats (comparing indices 1 and 2)
  2. The first embedding matches expected canonical vector values

This enhances test reliability while maintaining CI speed.

tests/test_sparse_embeddings.py (4)

11-11: Fixed model name typo.

Corrected "prithvida/Splade_PP_en_v1" to "prithivida/Splade_PP_en_v1".


67-71: Parameterized test for selective model testing.

Added parameterization to allow testing specific models during regular CI runs while preserving the ability to check if running in a manual workflow.


72-82: Implemented conditional testing logic for CI optimization.

This conditional logic:

  1. Skips large models (>1GB) in local environments
  2. For manual runs (workflow_dispatch), tests all models in CANONICAL_COLUMN_VALUES
  3. For regular CI runs, only tests the specified model_name

This approach significantly speeds up CI by reducing the test surface while maintaining comprehensive coverage for manual verification.


87-87: Dynamic test verification using model-specific expected results.

Fetches the expected results dynamically based on the current model being tested.

tests/test_text_onnx_embeddings.py (5)

75-80: Parameterized test for selective model execution.

Similar to other test files, this change:

  1. Parameterizes the test to accept a specific model_name
  2. Adds detection for manual execution via workflow_dispatch

This optimization allows targeted testing during CI while preserving comprehensive testing capabilities.


85-95: Added sophisticated model selection logic.

This conditional block intelligently selects models to test:

  1. For manual runs, it tests models in CANONICAL_VECTOR_VALUES with exceptions for:
    • Models in MULTI_TASK_MODELS
    • Specific models on macOS platforms
  2. For regular CI runs, it only tests the explicitly specified model_name

This approach effectively balances comprehensive coverage with CI speed.


112-129: Simplified test parameterization for faster CI.

Removed the second model from parameterization in test_batch_embedding and test_parallel_processing, focusing testing on a single model to speed up CI.


121-121: Used dynamic test assertion for better maintainability.

Changed from hardcoded assertion to dynamic length check using len(docs), making the test more maintainable and accurate.


144-144: Used dynamic test assertion for better maintainability.

Changed from hardcoded assertion to dynamic length check using len(docs), making the test more maintainable and accurate.

tests/test_text_cross_encoder.py (4)

19-19: Simplified test parameterization for CI speed improvement.

The parameterization has been streamlined to only use a single small model for CI runs instead of testing all supported models. This change aligns with the PR objective to speed up CI.


22-33: Improved test flexibility with environment-based model selection.

The new implementation intelligently selects which models to test based on the execution context:

  • Skips models larger than 1GB in non-CI environments to avoid lengthy downloads
  • Tests all models in CANONICAL_SCORE_VALUES during manual workflow runs
  • Only tests the parameterized model during normal CI runs

This approach balances comprehensive testing with CI performance.


55-55: Consistent model specification across all test functions.

All test functions now use the same small model "Xenova/ms-marco-MiniLM-L-6-v2" consistently, which reduces test time and resource usage while maintaining adequate test coverage.

Also applies to: 81-81, 95-95


35-52:

❓ Verification inconclusive

Verify model reuse and assertion reliability.

The test now creates a model instance before the assertions, but the model_name parameter may not match the model_desc.model that's being tested in the manual workflow case.


🏁 Script executed:

#!/bin/bash
# Check if the model instantiation is using the correct model name
grep -A 5 "model = TextCrossEncoder" tests/test_text_cross_encoder.py

Length of output: 1352


Action: Verify Model Name Consistency Across Workflows

The tests instantiate the TextCrossEncoder using the variable model_name in all cases (including lazy loading and batch processing). However, please ensure that the model_name provided in these tests is consistent with the model reference (typically model_desc.model) used in manual workflows. Double-check that the source of model_name is correct so that the instantiated model matches the model actually exercised in the manual workflow case.

tests/test_late_interaction_embeddings.py (4)

156-157: Standardized model selection for consistent testing.

All test functions now use the same small model "answerdotai/answerai-colbert-small-v1" consistently for CI runs, which is a good approach to reduce test time while ensuring adequate coverage.

Also applies to: 173-174, 201-202, 229-230, 252-253


179-189: Robust model selection logic addresses previous issues.

The implementation properly filters models based on:

  • Size constraints in non-CI environments
  • Presence in canonical values for manual workflow runs
  • Exact match for normal CI runs

This approach eliminates the previously identified StopIteration errors by avoiding the use of next() without a default value.


207-217: Consistent model filtering logic for query embeddings.

The model selection logic correctly mirrors the implementation in test_single_embedding, ensuring consistent behavior across both document and query embedding tests.


229-233: Parameter combination enhances test clarity.

The test_parallel_processing function now accepts both token_dim and model_name parameters in a tuple, making the relationship between token dimensions and models explicit and maintainable.

coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (4)
tests/test_text_multitask_embeddings.py (4)

89-104: Redundant model iteration with parameterized input

There's a mismatch between the parameterized approach and the internal iteration logic. The function takes a parameterized model_name but then iterates through all supported models internally, effectively ignoring the parameter except as an initial filter.

Consider either:

  1. Removing the parameterization and keeping the iteration logic, or
  2. Removing the iteration logic and using the parameterized model directly
@pytest.mark.parametrize("model_name", ["jinaai/jina-embeddings-v3"])
def test_single_embedding(model_name: str):
    is_ci = os.getenv("CI")
    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"

-    for model_desc in TextEmbedding._list_supported_models():
-        if model_desc.model not in CANONICAL_VECTOR_VALUES:
-            continue
-        if not is_ci:
-            if model_desc.size_in_GB > 1:
-                continue
-        elif not is_manual:
-            continue
-
-        model_name = model_desc.model
-        dim = model_desc.dim

+    if not is_manual:
+        pytest.skip("Skipping in CI non-manual mode")
+
+    # Get model dimensions from supported models list
+    dim = next((m.dim for m in TextEmbedding._list_supported_models() 
+               if m.model == model_name), None)
+    if dim is None or model_name not in CANONICAL_VECTOR_VALUES:
+        pytest.skip(f"Model {model_name} not found or not in canonical values")

227-231: Consider making test skipping behavior more consistent

Some parameterized tests use the approach of skipping if not manual, while others like test_task_assignment still iterate through models. Consider adopting a consistent pattern across all tests for better maintainability.


251-258: Missing model validity check

Unlike other tests, this one doesn't verify if the parameterized model exists in CANONICAL_VECTOR_VALUES. For consistency, consider adding this check.

@pytest.mark.parametrize("model_name", ["jinaai/jina-embeddings-v3"])
def test_lazy_load(model_name: str):
    is_ci = os.getenv("CI")
    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"

    if not is_manual:
        pytest.skip("Skipping in CI non-manual mode")
+
+   if model_name not in CANONICAL_VECTOR_VALUES:
+       pytest.skip(f"Model {model_name} not in canonical values")

63-267: Document the new testing strategy

The PR introduces a significant change in testing strategy (conditional testing based on workflow type), but lacks documentation explaining this approach. Consider adding a module-level docstring explaining:

  1. Which tests run during normal CI vs. manual workflow runs
  2. How parameterized tests are configured
  3. The purpose of the optimization

This will help maintainers understand the testing strategy at a glance.

import os

import numpy as np
import pytest

from fastembed import TextEmbedding
from fastembed.text.multitask_embedding import Task
from tests.utils import delete_model_cache

+"""
+Test module for text multitask embeddings.
+
+Testing strategy:
+- During normal CI runs, most tests are skipped to speed up the pipeline
+- During manual workflow runs (GITHUB_EVENT_NAME == "workflow_dispatch"), 
+  comprehensive tests are executed
+- Tests are parameterized with specific models and dimensions for focused testing
+
+This approach balances speed (for regular CI runs) with thorough testing
+(for manual verification).
+"""
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 643ccc9 and 8bb521a.

📒 Files selected for processing (1)
  • tests/test_text_multitask_embeddings.py (6 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
  • GitHub Check: Python 3.13 test
🔇 Additional comments (5)
tests/test_text_multitask_embeddings.py (5)

63-64: Good use of parametrization to speed up CI

The parametrized test with specific model and dimension values helps focus testing on a single configuration, which aligns well with the PR's goal of speeding up CI.


66-67: Smart conditional test execution based on workflow type

Using the GITHUB_EVENT_NAME environment variable to detect manual workflow triggers is a good approach for CI optimization. This allows comprehensive testing during manual runs while keeping automated CI pipelines faster.

Also applies to: 70-71
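
A minimal sketch of the gate, assuming only the two standard GitHub Actions environment variables (CI and GITHUB_EVENT_NAME):

import os

import pytest


def test_comprehensive_models() -> None:
    is_ci = os.getenv("CI")
    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"

    # Pull requests and pushes keep the pipeline fast; the full matrix only
    # runs when someone dispatches the workflow by hand.
    if is_ci and not is_manual:
        pytest.skip("comprehensive run only on workflow_dispatch")

    assert True  # slower, exhaustive assertions would go here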


125-141: Same redundant model iteration pattern

This function has the same issue as test_single_embedding - it takes a parameterized model name but then iterates through all supported models internally.

The issue also appears in test_single_embedding_passage and should be addressed consistently across all similar test functions.


197-206: Good parameterization and conditional skipping

This test function correctly uses the parameterized values and skips only when not manually triggered, providing a cleaner implementation compared to the looping approach in other test functions.


237-239: Good validation check for model presence

The check for models in CANONICAL_VECTOR_VALUES helps avoid unnecessary test runs for models without reference data, contributing to CI speedup.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (2)
tests/test_text_multitask_embeddings.py (2)

89-93: Consider consolidating model selection logic across test functions.

While parameterization simplifies the test function signature, the function still contains an inner loop over model descriptions, which may create confusion about which models are actually being tested.

Consider consolidating the model selection logic to either:

  1. Use purely parameterized testing without inner loops:
-@pytest.mark.parametrize("model_name", ["jinaai/jina-embeddings-v3"])
+@pytest.mark.parametrize("model_name,dim", [("jinaai/jina-embeddings-v3", 1024)])
 def test_single_embedding(model_name: str):
     is_ci = os.getenv("CI")
     is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
-
-    for model_desc in JinaEmbeddingV3._list_supported_models():
-        if not is_ci:
-            if model_desc.size_in_GB > 1:
-                continue
-        elif not is_manual:
-            continue
-
-        model_name = model_desc.model
-        dim = model_desc.dim
+    
+    if is_ci and not is_manual:
+        pytest.skip("Skipping in CI non-manual mode")
  2. Or create a helper function to generate test parameters:
def get_test_models():
    models = []
    for model_desc in JinaEmbeddingV3._list_supported_models():
        if model_desc.model in CANONICAL_VECTOR_VALUES:
            models.append((model_desc.model, model_desc.dim))
    return models if not os.getenv("CI") or os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch" else [("jinaai/jina-embeddings-v3", 1024)]

94-100: Simplify conditional logic for model selection.

The nested conditional checks can be simplified for better readability.

-    for model_desc in JinaEmbeddingV3._list_supported_models():
-        if not is_ci:
-            if model_desc.size_in_GB > 1:
-                continue
-        elif not is_manual:
-            continue
+    for model_desc in JinaEmbeddingV3._list_supported_models():
+        # Skip large models in local testing or non-manual CI runs
+        if (not is_ci and model_desc.size_in_GB > 1) or (is_ci and not is_manual):
+            continue
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 8bb521a and 4092ba6.

📒 Files selected for processing (1)
  • tests/test_text_multitask_embeddings.py (7 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (8)
tests/test_text_multitask_embeddings.py (8)

7-7: Good addition of Task import.

The explicit import of Task from fastembed.text.multitask_embedding improves code clarity by making it clear where the enum is defined.


63-64: Well-structured parameterization of test function.

Parameterizing the test with specific dimensions and model name is a good approach for CI optimization, allowing targeted testing of specific configurations while maintaining the ability to extend test coverage in the future.


66-72: Effective CI optimization with manual trigger check.

The addition of the manual trigger check (is_manual) is a good approach to optimize CI runs, skipping resource-intensive tests during automated CI runs while allowing them to be executed when manually triggered.


73-83: Simplified testing logic with direct model instantiation.

The refactored implementation now directly uses the parameterized model name, which simplifies the code and makes it more readable. The core test logic remains intact, ensuring embeddings have the correct shape and match canonical vectors.


163-168: Duplicate code pattern detected.

This is the same conditional logic pattern repeated for the third time.


191-200: Clear and consistent pattern for CI optimization.

Good use of the same pattern for controlling test execution in CI, skipping resource-intensive tests unless manually triggered. This is consistent with the approach used in other test functions.


245-252: Consistent skip pattern for optimal CI performance.

The consistent application of the skip pattern based on CI and manual trigger status ensures that resource-intensive tests are only run when explicitly requested, improving CI performance.


222-233:

❓ Verification inconclusive

Ensure testing coverage for all models in CANONICAL_VECTOR_VALUES.

The test now skips models that are not in CANONICAL_VECTOR_VALUES, which is appropriate, but ensure that all models that should have canonical vectors are included in that dictionary.


🏁 Script executed:

#!/bin/bash
# Verify all models returned by _list_supported_models have entries in CANONICAL_VECTOR_VALUES

# List models in the test file
echo "Models in CANONICAL_VECTOR_VALUES:"
grep -A 1 -n "_VECTOR_VALUES = {" tests/test_text_multitask_embeddings.py | grep -A 10 -B 2 ":" | grep -oE '"[^"]+"'

# List models from the JinaEmbeddingV3 class
echo -e "\nModels from JinaEmbeddingV3._list_supported_models():"
grep -n "_list_supported_models" -A 30 fastembed/text/multitask_embedding.py | grep -E "model.*:" | grep -oE '"[^"]+"'

Length of output: 465


Manual Verification Required: Confirm Consistency Between Supported Models and Canonical Vectors

The test now correctly skips models not covered by the canonical dictionary; however, based on the manual grep output, the only model detected in CANONICAL_VECTOR_VALUES is "jinaai/jina-embeddings-v3", and no model names were extracted from JinaEmbeddingV3._list_supported_models(). Please manually verify that every model in _list_supported_models() that should receive canonical vector coverage has a corresponding entry in CANONICAL_VECTOR_VALUES.

  • File: tests/test_text_multitask_embeddings.py (Lines 222–233)
  • Action: Ensure that the list of supported models (as returned by JinaEmbeddingV3._list_supported_models()) properly aligns with the models defined in CANONICAL_VECTOR_VALUES. If additional models should be included, update the dictionary accordingly.

Comment on lines 129 to 134
    for model_desc in JinaEmbeddingV3._list_supported_models():
        if not is_ci:
            if model_desc.size_in_GB > 1:
                continue
        elif not is_manual:
            continue

🛠️ Refactor suggestion

Duplicate conditional logic across test functions.

The same conditional logic for model selection is repeated in multiple test functions, which could lead to maintenance issues.

Consider extracting this common logic into a helper function or fixture:

def should_test_model(model_desc, is_ci, is_manual):
    """Determine if a model should be tested based on environment and model size."""
    if not is_ci and model_desc.size_in_GB > 1:
        return False
    if is_ci and not is_manual:
        return False
    return True

Then use this in all test functions:

for model_desc in JinaEmbeddingV3._list_supported_models():
    if not should_test_model(model_desc, is_ci, is_manual):
        continue


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/utils.py (1)

42-51: Well-designed test selection utility function

The new should_test_model function is a well-designed utility that centralizes the logic for determining which models to test based on the environment. The function properly handles:

  1. Skipping large models in non-CI environments
  2. Focusing tests on specific models in CI when not manually triggered
  3. Including good documentation

This will reduce test execution time while ensuring proper coverage.

However, consider extracting the hardcoded 1 (GB) threshold into a named constant as previously suggested.

+# Maximum model size in GB to test in non-CI environments
+MAX_LOCAL_MODEL_SIZE_GB = 1
+
 def should_test_model(
     model_name: str, model_desc: BaseModelDescription, is_ci: Optional[str], is_manual: bool
 ):
     """Determine if a model should be tested based on environment"""
     if not is_ci:
-        if model_desc.size_in_GB > 1:
+        if model_desc.size_in_GB > MAX_LOCAL_MODEL_SIZE_GB:
             return False
     elif not is_manual and model_desc.model != model_name:
         return False
     return True
tests/test_sparse_embeddings.py (1)

71-72: Consider using a more descriptive variable name

While is_manual works, a more descriptive name like is_workflow_dispatch might make the intent clearer.

-    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
+    is_workflow_dispatch = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
tests/test_late_interaction_embeddings.py (1)

176-177: Consider using a more descriptive variable name

While is_manual works, a more descriptive name like is_workflow_dispatch might make the intent clearer.

-    is_manual = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
+    is_workflow_dispatch = os.getenv("GITHUB_EVENT_NAME") == "workflow_dispatch"
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 4092ba6 and 3826294.

📒 Files selected for processing (7)
  • tests/test_image_onnx_embeddings.py (3 hunks)
  • tests/test_late_interaction_embeddings.py (3 hunks)
  • tests/test_sparse_embeddings.py (4 hunks)
  • tests/test_text_cross_encoder.py (4 hunks)
  • tests/test_text_multitask_embeddings.py (7 hunks)
  • tests/test_text_onnx_embeddings.py (5 hunks)
  • tests/utils.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_text_multitask_embeddings.py
⏰ Context from checks skipped due to timeout of 90000ms (16)
  • GitHub Check: Python 3.13.x on windows-latest test
  • GitHub Check: Python 3.13.x on macos-latest test
  • GitHub Check: Python 3.13.x on ubuntu-latest test
  • GitHub Check: Python 3.12.x on windows-latest test
  • GitHub Check: Python 3.12.x on macos-latest test
  • GitHub Check: Python 3.12.x on ubuntu-latest test
  • GitHub Check: Python 3.11.x on windows-latest test
  • GitHub Check: Python 3.11.x on macos-latest test
  • GitHub Check: Python 3.11.x on ubuntu-latest test
  • GitHub Check: Python 3.10.x on windows-latest test
  • GitHub Check: Python 3.10.x on macos-latest test
  • GitHub Check: Python 3.13 test
  • GitHub Check: Python 3.10.x on ubuntu-latest test
  • GitHub Check: Python 3.9.x on windows-latest test
  • GitHub Check: Python 3.9.x on macos-latest test
  • GitHub Check: Python 3.9.x on ubuntu-latest test
🔇 Additional comments (31)
tests/test_image_onnx_embeddings.py (6)

11-11: Good addition of utility function import

The addition of should_test_model from tests.utils is a good change that helps centralize model selection logic.


30-31: Well-structured test parameterization

The change from checking all models to using parameterization is a good improvement that gives more control over which models are tested in different environments.


33-33: Enhanced CI detection with workflow dispatch awareness

Adding workflow dispatch detection allows for better control over test execution, distinguishing between automated CI runs and manually triggered workflows.


36-37: Improved model selection logic

Using the centralized should_test_model function creates a consistent pattern across test files for determining which models to test based on environment conditions.


59-59: Good validation for embedding equivalence

Adding assertion to check that the embeddings from different file formats (path vs. opened image) are equivalent improves test coverage.
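
Roughly what such an assertion looks like, assuming fastembed's ImageEmbedding accepts both file paths and PIL images as the test implies; the model name and file path are illustrative:

import numpy as np
from PIL import Image

from fastembed import ImageEmbedding

model = ImageEmbedding(model_name="Qdrant/clip-ViT-B-32-vision")
path = "tests/misc/image.jpeg"  # illustrative path to a test asset

embedding_from_path = list(model.embed([path]))[0]
with Image.open(path) as img:
    embedding_from_image = list(model.embed([img]))[0]

# Same picture in two input forms should produce (near-)identical vectors.
assert np.allclose(embedding_from_path, embedding_from_image, atol=1e-5)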


79-84: Better test validation with canonical vector verification

Good improvement to assert both that embeddings have expected shapes and that they match canonical vectors, providing more robust validation.
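
A compact sketch of the combined check; the canonical slice values below are made up for illustration, since real ones are computed from the original model:

import numpy as np

# Only a short prefix of the canonical vector is stored; that is enough to
# catch regressions without embedding full vectors in the test file.
CANONICAL_PREFIX = np.array([0.0170, 0.0557, 0.0231, 0.0077, -0.0237])


def check_embedding(embedding: np.ndarray, expected_dim: int) -> None:
    assert embedding.shape == (expected_dim,)
    assert np.allclose(embedding[: CANONICAL_PREFIX.size], CANONICAL_PREFIX, atol=1e-3)


demo = np.zeros(512)
demo[: CANONICAL_PREFIX.size] = CANONICAL_PREFIX
check_embedding(demo, 512)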

tests/utils.py (1)

6-8: Clean import organization

The addition of Optional and BaseModelDescription imports maintains good organization of dependencies.

tests/test_text_cross_encoder.py (5)

7-7: Good addition of utility function import

The addition of should_test_model from tests.utils is a good change that helps centralize model selection logic.


19-19: Focused model testing through parameterization

Parameterizing with a single model is a good approach to keep the CI testing focused and efficient while still verifying core functionality.


22-22: Enhanced CI detection with workflow dispatch awareness

Adding workflow dispatch detection allows for better control over test execution, distinguishing between automated CI runs and manually triggered workflows.


24-43: Improved model selection and test structure

The loop over all supported models combined with the should_test_model check creates a flexible test structure that:

  1. Works efficiently in CI by focusing on specified models
  2. Tests more models when manually triggered
  3. Centralizes the selection logic

This is a significant improvement to the test structure.


48-48: Consistent parameterization across test functions

Using consistent model selection across all test functions helps maintain a focused test suite while ensuring core functionality is verified.

Also applies to: 74-74, 88-88

tests/test_text_onnx_embeddings.py (8)

8-8: Good addition of utility function import

The addition of should_test_model from tests.utils is a good change that helps centralize model selection logic.


75-76: Well-structured test parameterization

The change from checking all models to using parameterization is a good improvement that gives more control over which models are tested in different environments.


79-79: Enhanced CI detection with workflow dispatch awareness

Adding workflow dispatch detection allows for better control over test execution, distinguishing between automated CI runs and manually triggered workflows.


82-87: Improved model filtering logic

The model filtering now properly checks:

  1. If the model is in the multi-task models list that should be skipped
  2. If running on macOS with a specific model that might have issues
  3. Applies the centralized should_test_model function

This creates a more robust and maintainable test filtering system.
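
A sketch of how those three checks might compose, using the should_test_model signature shown later in this review; the multi-task list and the macOS-specific model name are illustrative assumptions:

import platform

from tests.utils import should_test_model

MULTI_TASK_MODELS = ["jinaai/jina-embeddings-v3"]  # covered by a dedicated test module


def keep(model_desc, pinned_model, is_ci, is_manual) -> bool:
    # 1. Multi-task models are exercised elsewhere.
    if model_desc.model in MULTI_TASK_MODELS:
        return False
    # 2. Work around a model that misbehaves on macOS runners (illustrative name).
    if platform.system() == "Darwin" and model_desc.model == "some/problematic-model":
        return False
    # 3. Centralized environment / size / pinned-model check.
    return should_test_model(pinned_model, model_desc, is_ci, is_manual)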


100-100: Simplified error message formatting

The change from model_desc["model"] to model_desc.model simplifies error reporting and is more consistent with the object-oriented access pattern.


105-105: Focused model testing through parameterization

Reducing the list of parameterized models from two to one for test_batch_embedding and test_parallel_processing helps speed up CI execution while still testing core functionality.

Also applies to: 119-119


114-114: Dynamic test assertions

Replacing hardcoded shape values (200) with dynamic expressions (len(docs)) improves test maintenance and reduces the chance of errors when changing test data.

Also applies to: 134-134
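
The pattern in miniature; the documents and embedding dimension are illustrative:

import numpy as np

docs = ["hello world", "flag embedding"] * 100  # 200 documents

embeddings = np.random.rand(len(docs), 384)  # stand-in for real model output

# Deriving the expected row count from the input keeps the assertion valid
# even if the size of the test corpus changes later.
assert embeddings.shape == (len(docs), 384)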


142-142: Consistent parameterization for lazy loading tests

Adding parameterization to the test_lazy_load function creates consistency with other test functions and enables better control over which models are tested.
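
A sketch of such a parameterized lazy-load test, assuming fastembed's lazy_load keyword defers model initialization until first use; the model name is illustrative:

import pytest

from fastembed import TextEmbedding


@pytest.mark.parametrize("model_name", ["BAAI/bge-small-en-v1.5"])
def test_lazy_load(model_name: str) -> None:
    # With lazy_load=True the ONNX session is only created on first use.
    model = TextEmbedding(model_name=model_name, lazy_load=True)
    embeddings = list(model.embed(["hello world"]))
    assert len(embeddings) == 1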

tests/test_sparse_embeddings.py (5)

8-8: Good addition of utility function dependency

The import of should_test_model from tests.utils is a good approach for centralizing the test eligibility logic across all test files, promoting code reuse and consistency.


11-11: Fixed model name typo in dictionary key

Good catch - fixing the model name from "prithvida/Splade_PP_en_v1" to "prithivida/Splade_PP_en_v1". This ensures proper model resolution during testing.


52-65: Improved test structure with parameterization

The move to a parameterized test structure is excellent. This approach:

  1. Makes the test more maintainable
  2. Allows for selective test execution
  3. Enables clearer test coverage metrics
  4. Simplifies the test implementation

The is_ci check for cleanup is also good practice to ensure resources are properly released in CI environments.
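
A minimal sketch of that structure, assuming the SparseTextEmbedding API and the delete_model_cache helper these tests import; the internal _model_dir attribute is an assumption about fastembed's internals, not a documented API:

import os

import pytest

from fastembed import SparseTextEmbedding
from tests.utils import delete_model_cache


@pytest.mark.parametrize("model_name", ["prithivida/Splade_PP_en_v1"])
def test_sparse_embedding(model_name: str) -> None:
    is_ci = os.getenv("CI")

    model = SparseTextEmbedding(model_name=model_name)
    result = list(model.embed(["hello world"]))[0]
    assert result.values.shape == result.indices.shape

    if is_ci:
        # Free disk space on CI runners; keep the cache locally for reruns.
        delete_model_cache(model.model._model_dir)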


68-90: Enhanced test flexibility with dynamic model selection

The implementation of the should_test_model check along with environment detection is excellent. This intelligent test filtering:

  1. Reduces CI runtime by testing only specific models
  2. Provides a way to run more comprehensive tests in manual mode
  3. Avoids downloading large models unnecessarily
  4. Ensures tests are skipped gracefully when appropriate

93-121: Good refactoring of parallel processing test

The parameterization of this test is consistent with the overall approach in the file. Passing the model via the parameterized model_name rather than hardcoding the string inside the test body improves maintainability.

tests/test_late_interaction_embeddings.py (6)

9-9: Good addition of utility function dependency

The import of should_test_model from tests.utils is a good approach for centralizing the test eligibility logic across all test files.


156-170: Improved test structure with parameterization

The conversion to parameterized tests with a specific, smaller model ("answerdotai/answerai-colbert-small-v1") addresses previous issues where large models were being tested in CI. This is likely to significantly speed up the CI process.


179-182: Improved model selection logic

The should_test_model function elegantly replaces the previous complex logic that could lead to StopIteration errors. This addresses the issues raised in previous review comments while making the code more readable and maintainable.


200-203: Consistent model selection approach

Good job applying the same model selection pattern consistently across different test functions. This consistency makes the codebase more maintainable.


215-219: Improved parameterization with multiple parameters

Good approach using multiple parameters in the parametrize decorator to include both token_dim and model_name. This allows for more flexible test configurations while keeping the test focused on specific models that work well in CI.


238-238: Consistent model selection across tests

Using the same model for all tests ensures consistent behavior and avoids downloading multiple models during test execution.

@hh-space-invader hh-space-invader merged commit 6acfb00 into main Mar 6, 2025
22 checks passed
@hh-space-invader hh-space-invader deleted the speedup-ci branch March 6, 2025 02:40