Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix cpuId(0) return all zero in newer CPU, for example Intel i5-9600K #1

Closed
wants to merge 2 commits into from
Closed

Conversation

azhai
Copy link

@azhai azhai commented Dec 16, 2019

No description provided.

@codecov
Copy link

codecov bot commented Dec 16, 2019

Codecov Report

Merging #1 into master will decrease coverage by 0.55%.
The diff coverage is 33.33%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master       #1      +/-   ##
==========================================
- Coverage    93.9%   93.35%   -0.56%     
==========================================
  Files           6        6              
  Lines         361      361              
==========================================
- Hits          339      337       -2     
- Misses         13       15       +2     
  Partials        9        9
Impacted Files Coverage Δ
encoding_amd64.go 33.33% <33.33%> (ø) ⬆️
generator.go 94.54% <0%> (-0.91%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a802c3a...a23f8ad. Read the comment docs.

@alcore alcore self-assigned this Dec 17, 2019
@alcore
Copy link
Member

alcore commented Dec 30, 2019

Thanks for the contribution and apologies for the delay - this one actually took some digging (without much luck) and is more convoluted than it seems.

I can't merge the PR as is, primarily due to the changes in a23f8ad for quite obvious reasons I hope.

I still want to fix the underlying issue, however. With what you proposed, when mfi is 0, the code simply skips to the cpuId(1) call. I am unsure what happens when cpuId(7) is called on a CPU that does not support it, which could happen now that the check is omitted. I dislike undefined behavior but do understand that this only affects quite ancient architectures.

I ran into this myself (on older gen i7s). My temporary "fix" for this was the result of just debugging and doing this locally, in encoding_amd64.go line 17:

mfi, _, _, _ := cpuId(0)
mfi, _, _, _ = cpuId(0)

To my surprise the second call to cpuId() then works as expected, returning the Highest Function Parameter in EAX (as mfi), while the first call returns all zero, as you reported.

And while this does the trick on every platform I could test, I don't understand why the first call fails to begin with, let alone why the second one does not. More peculiarly, the issue does not seem to appear in go test setups/builds, which leads me to believe it isn't architecture specific at all - and instead the issue is related to ASM calling and Go's module init logic somehow.

I'd be willing to take either approach as a temporary fix, but both smell. If we could figure out where the actual issue lies, I'd be much happier with this. This has actually been blocking me from tagging a confident 1.0 and I never got around to dig into this

@alcore alcore closed this in e318f78 Feb 28, 2020
alcore added a commit that referenced this pull request Apr 4, 2020
- Includes a proper fix for #1;
- Fixes incorrect mask checks causing false positives on CPUs without actual support for the SIMD sets we need;
- Includes new test suite covering the detection routine;

Other code (especially os/arch specific) may also get moved to the internal package to declutter the root package.
@alcore
Copy link
Member

alcore commented Apr 4, 2020

To touch back on this: d3f9e9e now includes a proper fix, over the bandaid that was applied previously.

That said, I am still in the dark as to why the initial implementation worked in test builds, let alone why the bandaid "worked".

@azhai
Copy link
Author

azhai commented Apr 9, 2020

In version 1.1.0, the bug is exists yet, but it is ok in branch master.

Maybe That is an optimization of the golang compiler.
We can do this to fixing it:

import (
	"fmt"
	"runtime"
)
var hasVectorSupport = func() bool {
	if runtime.GOOS == "windows" {
		_ = fmt.Sprint(cpuId(0)) // Disable optimization in Windows
	}
	mfi, _, _, _ := cpuId(0) // TODO(alcore) See: https://github.com/muyo/sno/pull/1
	// other code ...
}

And my CPU is:

CPU Type HexaCore Intel Core i5-9600K, 4300 MHz (43 x 100)
CPU Alias Coffee Lake-S
CPU Stepping R0
Instruction Set x86, x86-64, MMX, SSE, SSE2, SSE3, SSSE3, SSE4.1, SSE4.2, AVX, AVX2, FMA, AES
Original Clock 3700 MHz
Min / Max CPU Multiplier 8x / 83x
Engineering Sample No
L1 Code Cache 32 KB per core
L1 Data Cache 32 KB per core
L2 Cache 256 KB per core (On-Die, ECC, Full-Speed)
L3 Cache 9 MB (On-Die, ECC, Full-Speed)

@alcore
Copy link
Member

alcore commented Apr 9, 2020

Correct - I'm planning a 1.2.0 release within the next few days, which will include this. Most likely along with 1.0.1 and 1.1.1 bugfix backports.

Glad master is working for you at least. You had me worried there for a moment before the edit :)
Thanks for your time in reporting and confirming it!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants