Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve uniform ShortByteString #116

Merged
merged 4 commits into from
Sep 2, 2021
Merged

Conversation

lehins
Copy link
Contributor

@lehins lehins commented Aug 31, 2021

random-1.2.0 contained a shortcut where architecture independent generation of bytes relied on bytestring's builder functionality, which forced us to generate ShortByteString as pinned. This PR fixes that technical debt.

This is a non-breaking change.

lehins added 3 commits August 31, 2021 19:34
Stop relying on `bytestring` for architecture independent generation
of `ShortByteString` and `ByteString`
@Bodigrim
Copy link
Contributor

Any idea how to test it on BE arch?

@lehins
Copy link
Contributor Author

lehins commented Aug 31, 2021

Any idea how to test it on BE arch?

Besides spinning up an AMD server from some cloud provider, like Hetzner for instance and running tests there, I have no other ideas.

@Bodigrim
Copy link
Contributor

Which AMD machines are big-endian? I thought all of them are little-endian, as well as modern ARM.

@Bodigrim
Copy link
Contributor

@juhp I recall you raising s390-related issues at GHC bug tracker. Do you possibly have an access to a big-endian machine to test this patch?

@lehins
Copy link
Contributor Author

lehins commented Aug 31, 2021

Oh look at that. AMD is LE too. In that case I have no clue how to get hold of BE hardware :D

@@ -105,6 +103,8 @@ import GHC.ForeignPtr
import Data.ByteString (ByteString)
#endif

#include "MachDeps.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid CPP: there is GHC.ByteOrder plus ghc-byteorder package for old GHCs:

  if impl(ghc < 8.4)
    build-depends:  ghc-byteorder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion. Done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the sake of anyone stumbling on this suggestion, this will not work. See: https://gitlab.haskell.org/ghc/ghc/-/issues/20338

@juhp
Copy link

juhp commented Sep 1, 2021

I can probably do a test build in the Fedora buildsystem.

You can see our latest release build here for example.

So you just want me to build this change? Get the testsuite to run looks rather tricky.

@juhp
Copy link

juhp commented Sep 1, 2021

Here is a scratch build: https://koji.fedoraproject.org/koji/taskinfo?taskID=74908079
(using ghc-8.10.5 and LTS 18 packages basically).

@lehins
Copy link
Contributor Author

lehins commented Sep 1, 2021

@juhp Thank you for your help. Unfortunately it is the random:spec test suite that needs to be run in order to confirm that big/little endian compatibility works as expected.

@juhp
Copy link

juhp commented Sep 2, 2021

Okay, I feared as much, perhaps I can get temporary access to a Fedora s390x instance... Otherwise in the worst case we will find out later I guess ;-)

@juhp
Copy link

juhp commented Sep 2, 2021

Good news I ran the testsuite on Fedora 34 s390x and it passed:

@lehins
Copy link
Contributor Author

lehins commented Sep 2, 2021

@juhp Awesome!!! Thank you very much for verifying this PR!!!

@lehins lehins merged commit d819629 into master Sep 2, 2021
@juhp
Copy link

juhp commented Sep 2, 2021

Sorry please wait - that was the wrong log... rechecking now

@juhp
Copy link

juhp commented Sep 2, 2021

I ran the testsuite on Fedora 34 s390x now correctly in your branch and I am afraid there was 1 test failure:

    genByteString/ShortByteString consistency:  FAIL
      test/Spec.hs:118:
      expected: [78,232,117,189]
       but got: [189,117,232,78]

(Sorry for the false confirmation earlier)

@lehins
Copy link
Contributor Author

lehins commented Sep 2, 2021

@juhp Dammit, that is unfortunate. Thank you for rechecking it!

@Bodigrim
Copy link
Contributor

Bodigrim commented Sep 3, 2021

@lehins actually, what is our goal here? To produce the same random numbers from the same seed both on LE and BE platforms? Why it does not suffice just to produce some random numbers, but not necessarily the same?

@lehins
Copy link
Contributor Author

lehins commented Sep 3, 2021

To produce the same random numbers from the same seed both on LE and BE platforms?

yes

Why it does not suffice just to produce some random numbers, but not necessarily the same?

I don't follow.

@lehins
Copy link
Contributor Author

lehins commented Sep 3, 2021

@Bodigrim Not random numbers, but sequence of random bytes

@lehins
Copy link
Contributor Author

lehins commented Sep 3, 2021

@Bodigrim I'll describe what is going on in a little more detail:

In order to to generate a ByteString (or a ShortByteString) we could do something like genByteStringM n g = pack <$> replicateM n (uniformM g)

However this would generate 64bits for every byte that will be used, which is extremely wasteful and inefficient.

What we do instead is generate one Word64 at a time and write into a mutable buffer until we fill it up. Writing it in BE/LE agnostic manner will ensure that generated ByteString will be the same for all architectures for the same generator.

There is also an extra issue at the end of a ByteString as well, since we often will have a tail that is smaller than Word64 (when mod n 8 /= 0) we need to write the first few bytes into the end of the ByteString in the same manner across architectures as well.

So the failing test in this #116 (comment) depicts that there is a problem in the logic (or in my assumptions of how it works) somewhere and we will get bytes in different order on BE vs LE machines.

Now, all I need is to figure out how can I get hands on BE machine so I can experiment with this, I can't be constantly bugging Jens to verify if a change works or not. I suspect the problem was there prior to this PR as well, except the test was not present until now and if anyone would ever run random on a BE machine random bytes would be still ... random, so this problem is not very well pronounced, nevertheless it is still there.

@curiousleo
Copy link
Contributor

Now, all I need is to figure out how can I get hands on BE machine so I can experiment with this, I can't be constantly bugging Jens to verify if a change works or not.

It looks like this project lets you run an emulated s390x Ubuntu with QEMU + Docker:

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
$ docker run --rm -t s390x/ubuntu uname -m
s390x

https://github.com/multiarch/qemu-user-static#getting-started

That might help.

@lehins
Copy link
Contributor Author

lehins commented Sep 6, 2021

Halleluja! I am not going crazy! @curiousleo enormous thank you for this suggestion with docker. It was a bit painful to get it to work, anything complicated like cabal or stack will not work on that docker, because it uses up memory like crazy.

So it appears my sanity is fine and GHC is not reporting ByteOrder correctly: https://gitlab.haskell.org/ghc/ghc/-/issues/20338

@Bodigrim thank you for suggesting to avoid CPP otherwise we would not have found that bug, but now I need to bring the CPP back.

@juhp thank you again for helping debug this. If you don't mind I'll ask you again in a little bit to run the test suite one last time, just to be sure. For now I'll need to bring back the CPP approach first.

@Bodigrim
Copy link
Contributor

Bodigrim commented Sep 6, 2021

GHC is not reporting ByteOrder correctly

Oh, that's pretty big. Thanks for debugging it.

@curiousleo
Copy link
Contributor

Damn, nice find @lehins.

@Bodigrim
Copy link
Contributor

Bodigrim commented Sep 7, 2021

It was a bit painful to get it to work, anything complicated like cabal or stack will not work on that docker, because it uses up memory like crazy.

I was able to use cabal -j1. It eats up to 16Gb RAM and is slow as hell, literally hours and hours to build dependencies, but succeeds.

@lehins
Copy link
Contributor Author

lehins commented Sep 8, 2021

😄 As I said, painful indeed

@lehins lehins deleted the improve-uniform-shortbytestring branch September 8, 2021 22:52
@Bodigrim Bodigrim mentioned this pull request Sep 18, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants