Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared library fails to build with LTO enabled #5906

Open
Tachi107 opened this issue Jun 9, 2022 · 4 comments
Open

Shared library fails to build with LTO enabled #5906

Tachi107 opened this issue Jun 9, 2022 · 4 comments
Labels
bug size-m Estimated task size: medium (~1w)

Comments

@Tachi107
Copy link
Contributor

Tachi107 commented Jun 9, 2022

Summary

When enabling LTO with CMAKE_INTERPROCEDURAL_OPTIMIZATION and USE_SHARED_MBEDTLS_LIBRARY, builds fail with the following Werror:

[100%] Linking C shared library libmbedtls.so
cd /tmp/tmp.vNVCh29cpE/mbedtls/build/library && /usr/bin/cmake -E cmake_link_script CMakeFiles/mbedtls.dir/link.txt --verbose=1
/usr/bin/cc -fPIC  -Wall -Wextra -Wwrite-strings -Wformat=2 -Wno-format-nonliteral -Wvla -Wlogical-op -Wshadow -Wformat-signedness -Wformat-overflow=2 -Wformat-truncation -Werror -Wmissing-declarations -Wmissing-prototypes -flto -fno-fat-lto-objects -shared -Wl,-soname,libmbedtls.so.14 -o libmbedtls.so.2.28.0 CMakeFiles/mbedtls.dir/debug.c.o CMakeFiles/mbedtls.dir/net_sockets.c.o CMakeFiles/mbedtls.dir/ssl_cache.c.o CMakeFiles/mbedtls.dir/ssl_ciphersuites.c.o CMakeFiles/mbedtls.dir/ssl_cli.c.o CMakeFiles/mbedtls.dir/ssl_cookie.c.o CMakeFiles/mbedtls.dir/ssl_msg.c.o CMakeFiles/mbedtls.dir/ssl_srv.c.o CMakeFiles/mbedtls.dir/ssl_ticket.c.o CMakeFiles/mbedtls.dir/ssl_tls.c.o CMakeFiles/mbedtls.dir/ssl_tls13_keys.c.o  -Wl,-rpath,/tmp/tmp.vNVCh29cpE/mbedtls/build/library: libmbedx509.so.2.28.0 libmbedcrypto.so.2.28.0 
/tmp/tmp.vNVCh29cpE/mbedtls/library/ssl_tls.c: In function ‘ssl_calc_finished_tls_sha384’:
/tmp/tmp.vNVCh29cpE/mbedtls/library/ssl_tls.c:3400:5: error: ‘mbedtls_sha512_finish_ret’ accessing 64 bytes in a region of size 48 [-Werror=stringop-overflow=]
3400 |     mbedtls_sha512_finish_ret( &sha512, padbuf );
    |     ^
/tmp/tmp.vNVCh29cpE/mbedtls/library/ssl_tls.c:3400:5: note: referencing argument 2 of type ‘unsigned char *’
/tmp/tmp.vNVCh29cpE/mbedtls/include/mbedtls/sha512.h:146:5: note: in a call to function ‘mbedtls_sha512_finish_ret’
146 | int mbedtls_sha512_finish_ret( mbedtls_sha512_context *ctx,
    |     ^
lto1: all warnings being treated as errors
lto-wrapper: fatal error: /usr/bin/cc returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status

System information

Mbed TLS version (number or commit id): 2.28.0 and 7e163d7
Operating system and version: Debian Testing (12)
Configuration (if not default, please attach mbedtls_config.h): default
Compiler and options (if you used a pre-built binary, please indicate how you obtained it): GCC 11, no custom options (apart from LTO)
Additional environment information: None

Expected behavior

MbedTLS should compile fine with LTO enabled

Actual behavior

This error prevents some users from using MbedTLS with LTO enabled. Notably, Ubuntu added the mbedtls package to their LTO blacklist.

Steps to reproduce

cmake -S . -B build -DUSE_SHARED_MBEDTLS_LIBRARY=true -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=true -DCMAKE_POLICY_DEFAULT_CMP0069=NEW -DENABLE_PROGRAMS=false -DENABLE_TESTING=false -DUSE_STATIC_MBEDTLS_LIBRARY=false
cmake --build build

Note that not all the options above are required to reproduce the problem, the needed ones are -DUSE_SHARED_MBEDTLS_LIBRARY=true -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=true -DCMAKE_POLICY_DEFAULT_CMP0069=NEW. The others are there to speed up compilation by skipping the build of unrelated parts.

Additional information

Discovered when building the Debian package.

The development branch seems to not be affected.

@gilles-peskine-arm
Copy link
Contributor

Thanks for reporting this! It's a known bug and we thought we'd fixed it: with an API change in 3.0 and a crude local workaround in 2.28 LTS. So it turns out LTO defeats the local workaround ☹️

Note that it's a spurious warning: the sha512 functions can either output a 48-byte SHA-384 or a 64-byte SHA-512. In this case the caller requests a 48-byte SHA-384 into a 48-byte buffer so all is fine. But GCC sees [64] in the function declaration and decides to emit the warning. This is correct C (according to the language definition, an array parameter is equivalent to a pointer parameter, the array size is ignored) but of course compilers are allowed to diagnose whatever they want, no matter how annoying.

I don't know how we'll fix that one.

Incidentally, why are you enabling LTO? In my limited experience, LTO doesn't improve performance. I only did very casual benchmarks though, and it probably depends on factors such as cache sizes. On the other hand LTO does help a lot when optimizing for size.

@Tachi107
Copy link
Contributor Author

Tachi107 commented Jun 9, 2022

Thanks for your fast reply :D

Note that it's a spurious warning: the sha512 functions can either output a 48-byte SHA-384 or a 64-byte SHA-512. In this case the caller requests a 48-byte SHA-384 into a 48-byte buffer so all is fine. But GCC sees [64] in the function declaration and decides to emit the warning. This is correct C (according to the language definition, an array parameter is equivalent to a pointer parameter, the array size is ignored) but of course compilers are allowed to diagnose whatever they want, no matter how annoying.

Maybe this is a dumb question, but why not substitute unsigned char output[64] with unsigned char output[48]? It would mean that the caller has to specify a buffer that is at least 48 bytes long, which is not always sufficient but is not wrong, and would probably fix this warning.

Incidentally, why are you enabling LTO? In my limited experience, LTO doesn't improve performance. I only did very casual benchmarks though, and it probably depends on factors such as cache sizes. On the other hand LTO does help a lot when optimizing for size.

It can help with some kinds of optimizations, but mainly because some environments build software with LTO enabled by default (like Ubuntu, Arch Linux, etc.), and build failures are always annoying. And I could ask you instead: why not enabling it? :)

@gilles-peskine-arm
Copy link
Contributor

As a data point, here are timings from running make test in the full configuration on my main work machine (x86_64: Intel(R) Core(TM) i7-7600U CPU @ 2.80GHz, running Ubuntu 20.04; gcc 9.4.0, clang 10.0.0), at different optimization levels. This is on a recent development with a few fixes for compiler warnings. Times are user-mode times, median of 3 runs.

Compiler Optimization Time
clang -Ofast 48.89
clang -flto -Ofast 47.68
clang -Os 45.32
clang -flto -Os 44.85
clang -Oz 45.21
clang -flto -Oz 51.63
gcc -Ofast 40.43
gcc -flto -Ofast 41.01
gcc -Os 49.97
gcc -flto -Os 49.95

For a -Ofast build, the difference in performance with -flto is within the variance. The inversion with gcc -flto -Ofast might have been bad luck: I did a few more runs and they don't seem to be consistently ranked. Obviously this can depend on the target architecture, on the exact machine (cache size etc.) and on the specific application's workload. But in general I expect this conclusion to be true: link-time optimization has small or negligible impact on the performance of Mbed TLS.

LTO does make a significant difference to code size. It seems to reduce the size of our sample programs to about 70% to 50%, with a few outliers (code that doesn't use PSA maybe?) being reduced to about 10%. That's in the full config, I'd expect less of a difference on a build with just the few cryptographic algorithms that a specification application actually needs.

We do aim to support LTO, and I'm preparing a pull request to add LTO runs to the CI and fix the warnings. But I would recommend using LTO only if you're optimizing for size.

And I could ask you instead: why not enabling it? :)

Higher risk of bugs (either due to library bugs such as subtle undefined behavior or improperly declared assembly blocks, or due to compiler bugs). Higher compilation times. No benefits if you don't care about code size.

@Tachi107
Copy link
Contributor Author

Tachi107 commented Aug 31, 2022 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug size-m Estimated task size: medium (~1w)
Projects
None yet
Development

No branches or pull requests

3 participants