
Bcrypt large speed regression #13510

Open

Daimanta opened this issue Nov 10, 2022 · 14 comments · Fixed by #13518
Labels: enhancement (Solving this issue will likely involve adding new logic or components to the codebase), frontend (Tokenization, parsing, AstGen, Sema, and Liveness), optimization, regression (It worked in a previous version of Zig, but stopped working)

Comments

@Daimanta

Daimanta commented Nov 10, 2022

Zig Version

0.10.0

Steps to Reproduce and Observed Behavior

The speed of bcrypt with equal/similar parameters has regressed by about a factor of 10 in Zig 0.10.0 compared with 0.9.x. This severely reduces the usability of the implementation as it stands.

Code as tested (wrapped in a `main` so it compiles as a standalone file):

```zig
const std = @import("std");
const bcrypt = std.crypto.pwhash.bcrypt;
const time = std.time;

pub fn main() !void {
    const begin = time.milliTimestamp();
    const params: bcrypt.Params = .{ .rounds_log = 10 };
    const hash_options: bcrypt.HashOptions = .{
        .allocator = std.heap.page_allocator,
        .params = params,
        .encoding = std.crypto.pwhash.Encoding.crypt,
    };
    var buffer: [bcrypt.hash_length]u8 = undefined;
    _ = try bcrypt.strHash("password", hash_options, buffer[0..]);
    const end = time.milliTimestamp();
    const diff = end - begin;
    std.debug.print("{d}\n", .{diff});
}
```

This code on my machine has around the following results:
Zig 0.9: 392ms
Zig 0.10: 5128ms

Expected Behavior

The speed of the algorithm should be similar to the speed in Zig 0.9.

@Daimanta Daimanta added the bug Observed behavior contradicts documented or intended behavior label Nov 10, 2022
@jedisct1 jedisct1 added the frontend Tokenization, parsing, AstGen, Sema, and Liveness. label Nov 11, 2022
@jedisct1
Contributor

Yes, this is a regression in stage2 that probably affects more than bcrypt.

Compiling with -fstage1 gets the previous speed back, but this is obviously just a bad workaround.

I'll see if I can pinpoint where the regression is.

@jedisct1
Contributor

jedisct1 commented Nov 11, 2022

On Apple M1:

Stage1: bcrypt:      0.017 s/ops
Stage2: bcrypt:      1.375 s/ops

That's an 80x factor.

@jedisct1 jedisct1 added optimization and removed bug Observed behavior contradicts documented or intended behavior labels Nov 11, 2022
@nektro
Contributor

nektro commented Nov 11, 2022

state: State vs state: *const State and #12215 could be culprits
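For context, here is a minimal hedged sketch of what the two signatures imply. The struct and function names below are hypothetical stand-ins, not the actual `std.crypto` code:

```zig
const std = @import("std");

// Hypothetical stand-in for a large cipher state such as bcrypt's.
const State = struct {
    table: [1024]u32,
};

// By value: semantically every call receives its own copy of the 4 KiB
// struct; unless the compiler passes it by reference behind the scenes,
// that copy happens on every round.
fn roundByValue(state: State, x: u32) u32 {
    return state.table[x % 1024] ^ x;
}

// By const pointer: only a pointer crosses the call boundary, so no copy
// is made regardless of what the compiler optimizes.
fn roundByPtr(state: *const State, x: u32) u32 {
    return state.table[x % 1024] ^ x;
}

pub fn main() void {
    var s = State{ .table = [_]u32{0} ** 1024 };
    s.table[1] = 42;
    std.debug.print("{} {}\n", .{ roundByValue(s, 1), roundByPtr(&s, 1) });
}
```

As the commit messages later in this thread note, the Zig language allows the compiler to perform this pass-by-reference optimization automatically; at the time, stage2 did not, which is why spelling out `*const State` restored stage1 performance.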

@jedisct1
Contributor

Besides the slowdown, the emitted assembly is also far larger:

❯ du -sh *.s
280K	main-stage1.s
4.1M	main-stage2.s

@jedisct1
Contributor

state: State vs state: *const State

That indeed seems to make a massive difference!

jedisct1 added a commit to jedisct1/zig that referenced this issue Nov 11, 2022
state: State -> state: *const State
Suggested by @nektro

Fixes ziglang#13510
@jedisct1
Contributor

Well done, @nektro !

state: *const State indeed brings us back to the stage1 speed.

jedisct1 added a commit to jedisct1/zig that referenced this issue Nov 11, 2022
state: State -> state: *const State
@IntegratedQuantum
Contributor

It might be worth noting that, as of yesterday, related performance problems such as #12638 were fixed in #13074.
Have you checked with the latest master whether the performance problems still occur?

@Vexu Vexu added this to the 0.10.1 milestone Nov 11, 2022
@jedisct1
Contributor

jedisct1 commented Nov 11, 2022

Master branch helps, but #13518 still makes a difference:

Apple M1:

master: 0.362 s/ops
master+13518: 0.017 s/ops
stage1: 0.017 s/ops

jedisct1 added a commit that referenced this issue Nov 14, 2022
state: State -> state: *const State
@andrewrk andrewrk modified the milestones: 0.10.1, 0.11.0 Jan 9, 2023
@andrewrk
Member

andrewrk commented Jan 9, 2023

I'm re-opening this issue; it can be closed once the compiler codegen is fixed and 7eed028 is reverted.

@andrewrk andrewrk reopened this Jan 9, 2023
@andrewrk andrewrk added enhancement Solving this issue will likely involve adding new logic or components to the codebase. regression It worked in a previous version of Zig, but stopped working. labels Jan 9, 2023
andrewrk pushed a commit that referenced this issue Jan 9, 2023
state: State -> state: *const State
@Daimanta
Author

I'm still seeing speed issues, although performance has indeed improved. Running the example code I get around 1100ms on Zig 0.10.1. This is better than 0.10.0 but still about 3 times slower than 0.9.1. I know there is another issue with data passing, but I'd like to keep this issue open, as it still heavily impacts the usability of the algorithm.

andrewrk added a commit that referenced this issue May 31, 2023
The Zig language allows the compiler to make this optimization
automatically. We should definitely make the compiler do that, and
revert this commit. However, that will not happen in this branch, and I
want to continue to explore achieving performance parity with
merge-base. So, this commit changes all InternPool parameters to be
passed by const pointer rather than by value.

I measured a 1.03x ± 0.03 speedup vs the previous commit compiling the
(set of passing) behavior tests. Against merge-base, this commit is
1.17x ± 0.04 slower, which is an improvement from the previous
measurement of 1.22x ± 0.02.

Related issue: #13510
Related issue: #14129
Related issue: #15688
andrewrk added a commit that referenced this issue Jun 4, 2023
jacobly0 pushed a commit to jacobly0/zig that referenced this issue Jun 11, 2023
andrewrk added a commit that referenced this issue Jun 11, 2023
@Daimanta
Author

I have rerun the experiment on zig-0.11.0-dev.4003. The speed I'm getting for the above code snippet is 320ms. I would consider this acceptable, as it is faster than the 0.9 implementation, and would consider closing this ticket as fixed.

@jedisct1
Contributor

@Daimanta The core issue doesn't seem to have been resolved.

Reverting #13518 still causes a significant speed regression.

On an M1 MacBook:

Before: 0.014 s/ops
After : 0.364 s/ops

@jedisct1
Contributor

See 90a877f for related issues.

@andrewrk
Member

Possible reduction of this issue: #17580

@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Feb 10, 2025