Skip to content

Add feature for Duplicate Key PIR #281

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 26 commits into from
Apr 23, 2025
Merged

Conversation

yunbo99
Copy link
Contributor

@yunbo99 yunbo99 commented Mar 21, 2025

No description provided.

Copy link

github-actions bot commented Mar 21, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @yunbo99, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request introduces a new feature for Duplicate Key PIR (DkPir), which enables batch queries while preserving data privacy. The changes include adding necessary dependencies, example data, configuration files, and core logic for both sender and receiver sides. The implementation extends the existing APSI framework to support scenarios where the server-side database contains multiple rows with the same key, enabling applications like billable PIR.

Highlights

  • Dependency Addition: The pull request adds heu as a bazel dependency in MODULE.bazel. This library is used for homomorphic encryption, a key component of the DkPir protocol.
  • Example Data: New CSV files (duplicate_key_db.csv, duplicate_key_query.csv, duplicate_key_target_result.csv) are added to examples/pir/apsi/data/ to demonstrate the duplicate key PIR functionality. These files contain sample data and queries for testing the new feature.
  • Configuration Files: New configuration files (dk_pir_receiver_online.json, dk_pir_sender_offline.json, dk_pir_sender_online.json) are added to examples/pir/config/ to define the settings for the DkPir sender and receiver in both online and offline modes. These configurations specify parameters like file paths, curve types, and communication settings.
  • Core Logic Implementation: The pull request introduces a new directory psi/algorithm/dkpir/ containing the core logic for the DkPir protocol. This includes implementations for the sender, receiver, OPRF receiver, and common utilities, as well as a new proto definition for secret keys.
  • CSV Conversion Utilities: The pull request adds a csv_converter library to psi/utils, providing utilities for converting CSV files into formats suitable for APSI and DkPir. This includes functionalities for merging columns and rows, extracting queries, and extracting results.

Changelog

Click here to see the changelog
  • MODULE.bazel
    • Added heu as a bazel dependency.
  • examples/pir/apsi/data/duplicate_key_db.csv
    • Added a CSV file containing example database with duplicate keys.
  • examples/pir/apsi/data/duplicate_key_query.csv
    • Added a CSV file containing example queries for duplicate keys.
  • examples/pir/apsi/data/duplicate_key_target_result.csv
    • Added a CSV file containing the expected results for the example queries.
  • examples/pir/config/dk_pir_receiver_online.json
    • Added a configuration file for the DkPir receiver in online mode.
  • examples/pir/config/dk_pir_sender_offline.json
    • Added a configuration file for the DkPir sender in offline mode.
  • examples/pir/config/dk_pir_sender_online.json
    • Added a configuration file for the DkPir sender in online mode.
  • psi/algorithm/dkpir/BUILD.bazel
    • Added a BUILD file for the dkpir algorithm, defining the necessary libraries and dependencies.
  • psi/algorithm/dkpir/README.md
    • Added a README file describing the DkPir scheme, its application scenarios, and instructions for testing.
  • psi/algorithm/dkpir/common.cc
    • Implemented common utility functions for DkPir, including polynomial computation, key saving/loading, and result writing.
  • psi/algorithm/dkpir/common.h
    • Defined common data structures and function prototypes for DkPir.
  • psi/algorithm/dkpir/encryptor.h
    • Defined a custom Elgamal encryptor without plaintext range restrictions.
  • psi/algorithm/dkpir/entry.cc
    • Implemented entry points for the DkPir sender and receiver in both offline and online modes.
  • psi/algorithm/dkpir/entry.h
    • Defined the entry point functions for the DkPir algorithm.
  • psi/algorithm/dkpir/oprf_receiver.cc
    • Implemented the OPRF receiver component for DkPir.
  • psi/algorithm/dkpir/oprf_receiver.h
    • Defined the OPRF receiver class for DkPir.
  • psi/algorithm/dkpir/query.h
    • Defined the DkPirQuery class, extending the base apsi::sender::Query to include a sender count database.
  • psi/algorithm/dkpir/receiver.cc
    • Implemented the DkPir receiver logic, including OPRF request handling and query processing.
  • psi/algorithm/dkpir/receiver.h
    • Defined the DkPir receiver class.
  • psi/algorithm/dkpir/secret_key.proto
    • Defined the protobuf message for storing secret keys.
  • psi/algorithm/dkpir/sender.cc
    • Implemented the DkPir sender logic, including OPRF response generation and query processing.
  • psi/algorithm/dkpir/sender.h
    • Defined the DkPir sender class.
  • psi/algorithm/dkpir/sender_cnt_db.cc
    • Implemented the logic for generating the sender count database, including encryption of row counts.
  • psi/algorithm/dkpir/sender_cnt_db.h
    • Defined the functions for generating the sender count database.
  • psi/algorithm/dkpir/sender_dispatcher.cc
    • Implemented the DkPir sender dispatcher, handling OPRF requests and queries.
  • psi/algorithm/dkpir/sender_dispatcher.h
    • Defined the DkPir sender dispatcher class.
  • psi/apps/psi_launcher/BUILD.bazel
    • Added //psi/algorithm/dkpir:entry as a dependency to psi_launcher.
  • psi/apps/psi_launcher/launch.cc
    • Added RunDkPir functions to handle DkPir sender and receiver configurations.
  • psi/apps/psi_launcher/launch.h
    • Added function declarations for RunDkPir.
  • psi/proto/entry.proto
    • Added DkPirSenderConfig and DkPirReceiverConfig to the LaunchConfig message.
  • psi/proto/pir.proto
    • Added DkPirSenderConfig and DkPirReceiverConfig messages, and imported psi/proto/psi.proto.
  • psi/utils/BUILD.bazel
    • Added csv_converter library and its test.
  • psi/utils/csv_converter.cc
    • Implemented the ApsiCsvConverter class for converting CSV files.
  • psi/utils/csv_converter.h
    • Defined the ApsiCsvConverter class.
  • psi/utils/csv_converter_test.cc
    • Implemented tests for the ApsiCsvConverter class.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Trivia time!

What does the acronym PIR stand for in the context of Private Information Retrieval?

Click here for the answer
PIR stands for Private Information Retrieval.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a new feature for Duplicate Key PIR (DkPir), including necessary code and configuration files. The changes seem well-structured, but there are a few points that need attention to ensure correctness and maintainability.

Summary of Findings

  • Missing License Information: The MODULE.bazel file lacks license information for the newly added dependency 'heu'. This should be added to comply with licensing requirements.
  • Potential Data Leakage: The current implementation of the DkPir scheme relies on the assumption that all client queries are present in the server's database. If this assumption is not met, the row count verification may fail, potentially leaking information about the client's query.
  • Lack of Input Validation: The code does not appear to have sufficient input validation, especially for file paths and configurations. This could lead to unexpected behavior or security vulnerabilities.

Merge Readiness

The pull request introduces a significant new feature, but there are several issues that need to be addressed before merging. Specifically, the missing license information, potential data leakage, and lack of input validation should be resolved. I am unable to approve this pull request, and recommend that it not be merged until these issues are addressed.

@yunbo99
Copy link
Contributor Author

yunbo99 commented Mar 21, 2025

I have read the CLA Document and I hereby sign the CLA

@qxzhou1010 qxzhou1010 self-requested a review March 28, 2025 02:53
Copy link
Contributor

@qxzhou1010 qxzhou1010 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

按照之前的Review结果均做了修改,算法Review这边无问题,代码本身方面我这边也暂未发现问题。请工程同学再DoubleCheck一下。

@qxzhou1010
Copy link
Contributor

LGTM

Copy link
Collaborator

@huocun-ant huocun-ant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

huocun-ant
huocun-ant previously approved these changes Apr 21, 2025
@huocun-ant huocun-ant merged commit 37e0888 into secretflow:main Apr 23, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants