Experimental results differ from the paper's results #9

Open
Huangryyyy opened this issue Mar 19, 2025 · 0 comments

Description:

I have been attempting to reproduce the experimental results from your paper "Efficient Test-Time Adaptation of Vision-Language Models" as outlined in the repository. However, after running the experiments using the provided code and data, the results I obtained do not align with those reported in the paper.

Steps to Reproduce:

1. Clone the repository and set up the environment as per the instructions in the README.

2. Run the following commands to reproduce the results:

bash scripts/run_cd_benchmark_rn50.sh 
bash scripts/run_cd_benchmark_vit.sh 
bash scripts/run_ood_benchmark_rn50.sh
bash scripts/run_ood_benchmark_vit.sh

3. Modify the code by removing the update_cache() and compute_cache_logits() sections in the function run_test_tda(), so that the model operates as zero-shot CLIP, and re-run the same commands.
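To make the ablation in step 3 concrete, here is a minimal sketch of what I understand "operating as zero-shot CLIP" to mean: the prediction comes only from the scaled cosine similarity between the image feature and the class text features, with no cache term added. This is not the repository's code. The function names (`zero_shot_logits`, `predict`, `cache_logits_fn`) are hypothetical, and the scale of 100 is an assumption based on CLIP's usual logit scale.

```python
import math


def l2_normalize(vec):
    """Return vec scaled to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def zero_shot_logits(image_feat, class_text_feats, scale=100.0):
    """Scaled cosine similarity between one image and each class text feature.

    scale=100.0 is an assumption (CLIP's usual logit scale), not a value
    taken from the repository.
    """
    img = l2_normalize(image_feat)
    logits = []
    for text_feat in class_text_feats:
        txt = l2_normalize(text_feat)
        logits.append(scale * sum(a * b for a, b in zip(img, txt)))
    return logits


def predict(image_feat, class_text_feats, cache_logits_fn=None):
    """Return the predicted class index.

    cache_logits_fn is a hypothetical stand-in for any cache-based
    adjustment; passing None corresponds to the zero-shot ablation.
    """
    logits = zero_shot_logits(image_feat, class_text_feats)
    if cache_logits_fn is not None:
        extra = cache_logits_fn(image_feat)
        logits = [l + e for l, e in zip(logits, extra)]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy 2-D features: the image aligns with class 0.
text_feats = [[1.0, 0.0], [0.0, 1.0]]
print(predict([1.0, 0.0], text_feats))  # zero-shot path only
```

If the modified run_test_tda() still adds any cache-derived term to the logits, it would not match this zero-shot path, which may be worth checking when comparing against the paper's CLIP baseline.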

Results:

Image

As shown in the image, "CLIPRN50/vit" refers to zero-shot CLIP. The accuracy of zero-shot CLIP in my experiment is higher than the results reported in the paper and is close to the performance of "TDARN50/vit".

Could you please provide guidance on how to resolve this discrepancy?
