-
Notifications
You must be signed in to change notification settings - Fork 81
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Take Bazel package names into account when generating GHC package names #222
Conversation
Note for @judah: ghc package names do leak to the user, so if we prefix them and if that requires some encoding, then we need to choose an encoding that's as readable as reasonably possible. These names at least leak in the Haddock HTML indexes, possibly elsewhere. If this weren't the case, any encoding no matter how ugly would do. After "sleeping on it", I think it's worth considering these alternative nuances to the bikeshed's colour:
If we can hide these names to the user in all instances, this all becomes moot. |
For the Haddock Package names can also appear in some error messages, but I don't know a good way around that. |
haskell/actions.bzl
Outdated
Returns: | ||
string: GHC package name to use. | ||
""" | ||
return "{0}0R{1}0D{2}".format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to @mboes' suggestion for dropping 0R/0D
.
I also don't have a strong preference between 0
, Z
or something else as the escape character.
Slight difference: where GHC has both `z` and `Z` escape characters (since they have two namespaces), we use `Z` only. Also get rid of `0R` and `0D`, which are now escaped like any other slash.
This makes the encoding slightly shorter in the common case.
As of tweag#222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), since we can influence - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs
As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), since we can influence - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs
As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to (I think) making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), now that the rule and package names can be identical: - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs
As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to (I think) making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), now that the rule and package names can be identical: - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs
As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to (I think) making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), now that the rule and package names can be identical: - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs are internal details of GHC.
As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to (I think) making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), now that the rule and package names can be identical: - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs are internal details of GHC.
Make package names more similar to the rule name. As of #222, both the package name and ID are z-encoded. However, GHC only needs the package *ID* to be unique, as long as you call `-package-id` instead of `-package`. As a result we can simplify how package-names are encoded: we can keep them the same as the rule name, other than fixing underscores. In addition to (I think) making error messages nicer, this paves the way for some improvements in Hazel (or similar tools), now that the rule and package names can be identical: - It helps support the PackageImports extension. - GHC >= 8.0 generates the Cabal macros (e.g., `MIN_VERSION_*`) automatically. Prebuilt dependencies are still passed with `-package` since their IDs are internal details of GHC.
It's needed by several Cabal packages; for example `lens`: https://github.com/ekmett/lens/blob/1aa37c4/src/Control/Lens/Internal/TH.hs#L102 If you don't set it, then packages like `lens` default to logic like: ``` lensPackageKey = "lens-" ++ showVersion version ``` But that doesn't work as of #222. For example the `fuzzyset` package breaks with Hazel: ``` bazel-out/k8-fastbuild/bin/external/haskell_fuzzyset/gen-srcs-fuzzyset/Data/FuzzySet/Lens.hs:17:1: error: * Failed to load interface for `Control.Lens.Type' no unit id matching `lens-4.15.4' was found (This unit ID looks like the source package ID; the real unit ID is `externalZShaskellZUlensZSlens-4.15.4') * In the type signature: _useLevenshtein :: lens-4.15.4:Control.Lens.Type.Lens' FuzzySet Bool | 17 | makeLensesFor | ^^^^^^^^^^^^^... ``` Given that `rules_haskell` has the logic that creates the id string, I think it makes sense for it to set this parameter itself (rather than manually-written rules, or macros that use it like Hazel/cabal2bazel).
It's needed by several Cabal packages; for example `lens`: https://github.com/ekmett/lens/blob/1aa37c4/src/Control/Lens/Internal/TH.hs#L102 If you don't set it, `lens` defaults to logic like: ``` lensPackageKey = "lens-" ++ showVersion version ``` But that doesn't work as of #222 which makes the package key more complicated. For example the `fuzzyset` package breaks with Hazel: ``` bazel-out/k8-fastbuild/bin/external/haskell_fuzzyset/gen-srcs-fuzzyset/Data/FuzzySet/Lens.hs:17:1: error: * Failed to load interface for `Control.Lens.Type' no unit id matching `lens-4.15.4' was found (This unit ID looks like the source package ID; the real unit ID is `externalZShaskellZUlensZSlens-4.15.4') * In the type signature: _useLevenshtein :: lens-4.15.4:Control.Lens.Type.Lens' FuzzySet Bool | 17 | makeLensesFor | ^^^^^^^^^^^^^... ``` Given that `rules_haskell` has the logic that creates the id string, I think it makes sense for it to set this parameter itself (rather than manually-written rules, or macros that use it like Hazel/cabal2bazel).
Close #219.
In short, there are many possible approaches to do encoding. The key, it seems, is to take a rare character and use it as escape character remembering to escape the escape character itself to achieve injectivity. With @mboes We discussed several possibilities and I implemented an encoding scheme where escaping character is zero
0
.So we use
0R
to separate workspace root from (Bazel) package name and0D
to separate package name from target name. In addition to that we escape zero0
as00
, underscore_
as0U
and slash/
as0S
. This should work.We also discussed using dash, perhaps something symmetric like
-U-
, but then two underscores in a row would result in an invalid package name, because GHC does not like underscores or more than one dash in a row.@mboes also proposed to prefix all digits with
0
because this is less-surprising (i.e. we escape 0, and so we should escape all digits for consistency).@judah What do you think?