From 7568691e7c7618964907c584befc0043d69772bc Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Mon, 15 Apr 2024 15:12:44 -0700 Subject: [PATCH 1/7] Add some basic guidelines around using C++ standard headers now that our build allows us to. --- docs/coding-guidelines/clr-code-guide.md | 28 ++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index ef6f7a555d615c..57af4f65b989c6 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -83,7 +83,11 @@ Written in 2006, by: * [2.10.4 When is it safe to use a runtime contract?](#2.10.4) * [2.10.5 Do not make unscoped changes to the ClrDebugState](#2.10.5) * [2.10.6 For more details...](#2.10.6) - * [2.11 Is your code DAC compliant?](#2.11) + * [2.11 Using standard headers](#2.11) + * [2.11.1 Do not use wchar_t](#2.11.1) + * [2.11.2 Do not use C++ standard exceptions](#2.11.2) + * [2.11.3 Do not use getenv on Unix platforms](#2.11.3) + * [2.12 Is your code DAC compliant?](#2.12) # 1 Why you must read this document @@ -1252,10 +1256,30 @@ This data is meant to be changed in a scoped manner only. In particular, the CON See the big block comment at the start of [src\inc\contract.h][contract.h]. -## 2.11 Is your code DAC compliant? +## 2.11 Using standard headers + +The C and C++ standard headers are available for usage in the CoreCLR code-base. However, there are a few restrictions on using the standard-provided APIs for code that will run as part of CoreCLR. + +Code that will only run in other processes, such as createdump or other extraneous tools, do not have many of these restrictions. + +### 2.11.1 Do not use wchar_t + +The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). Use the `WCHAR` alias instead, which is always 2 bytes. 
The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the CoreCLR minipal and in the repo minipal should be used. If a CoreCLR minipal API exists, it should be used instead of the PAL API. + +### 2.11.2 Do not use C++ standard exceptions + +The exception handling mechanisms in CoreCLR only handle `Exception`-derived types and `PAL_SEHException`. As a result, standard C++ exceptions, derived from `std::exception`, will cause runtime instability and should never be used. This includes using standard collection types with the default `std::allocator` allocator. + +### 2.11.3 Do not use getenv on Unix platforms + +The POSIX API `setenv` is not thread safe with `getenv` and can lead to crashes. CoreCLR provides a `PAL_getenv` API that is thread-safe. This API should be used instead when on non-Windows platforms. + +## 2.12 Is your code DAC compliant? At a high level, DAC is a technique to enable execution of CLR algorithms from out-of-process (eg. on a memory dump). Core CLR code is compiled in a special mode (with DACCESS_COMPILE defined) where all pointer dereferences are intercepted. Various tools (most notably the debugger and SOS) rely on portions of the CLR code being properly "DACized". Writing code in this way can be tricky and error-prone. Use the following references for more details: - The best documentation is in the code itself. See the large comments at the top of [src\inc\daccess.h](https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/daccess.h). + +C++ standard collections are not DAC-ized and cannot be DAC-ized, so they should never be used as fields in data structures or in global variables that need to be read by the DAC, even when using a CoreCLR compatible allocator. 
From fe48f4df92f96a2b90aac63ee01c56bcfcba1a85 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Fri, 19 Apr 2024 13:44:17 -0700 Subject: [PATCH 2/7] PR feedback on general wording. --- docs/coding-guidelines/clr-code-guide.md | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index 57af4f65b989c6..b91287da964a20 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -1258,13 +1258,13 @@ See the big block comment at the start of [src\inc\contract.h][contract.h]. ## 2.11 Using standard headers -The C and C++ standard headers are available for usage in the CoreCLR code-base. However, there are a few restrictions on using the standard-provided APIs for code that will run as part of CoreCLR. +The C and C++ standard headers are available for usage in the CoreCLR code-base. However, there are restrictions on using the standard-provided APIs for code that will run as part of CoreCLR. -Code that will only run in other processes, such as createdump or other extraneous tools, do not have many of these restrictions. +Code that will only run in other processes, such as `createdump` or other extraneous tools, do not have the same set of restrictions. ### 2.11.1 Do not use wchar_t -The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). Use the `WCHAR` alias instead, which is always 2 bytes. The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the CoreCLR minipal and in the repo minipal should be used. If a CoreCLR minipal API exists, it should be used instead of the PAL API. +The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). 
Use the `WCHAR` alias instead, which is always 2 bytes. The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the [CoreCLR minipal](https://github.com/dotnet/runtime/tree/main/src/coreclr/minipal) and in the [repo minipal](https://github.com/dotnet/runtime/tree/main/src/native/minipal) should be used. In these minipals, the APIs may use `char16_t` or a locally-defined `CHAR16_T` type. In both cases, these types are compatible with the `WCHAR` alias in CoreCLR. If a minipal API exists, it should be used instead of the PAL API. ### 2.11.2 Do not use C++ standard exceptions @@ -1282,4 +1282,4 @@ Various tools (most notably the debugger and SOS) rely on portions of the CLR co - The best documentation is in the code itself. See the large comments at the top of [src\inc\daccess.h](https://github.com/dotnet/runtime/blob/main/src/coreclr/inc/daccess.h). -C++ standard collections are not DAC-ized and cannot be DAC-ized, so they should never be used as fields in data structures or in global variables that need to be read by the DAC, even when using a CoreCLR compatible allocator. +C++ standard collections are not DAC-ized and cannot be DAC-ized, so they should never be used as fields in data structures or in global variables that need to be read by the DAC, even when using a CoreCLR compatible allocator. They can, however, be used as intermediate values. See [2.11](#2.11) for more rules about using C++ standard headers. 
From d76efe20110df2e05883fd842a4302bf4b87a2b0 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Wed, 15 May 2024 11:52:54 -0700 Subject: [PATCH 3/7] Add a blurb about libc++ --- docs/coding-guidelines/clr-code-guide.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index b91287da964a20..c30f05381944ee 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -1262,6 +1262,8 @@ The C and C++ standard headers are available for usage in the CoreCLR code-base. Code that will only run in other processes, such as `createdump` or other extraneous tools, do not have the same set of restrictions. +To ensure we're using a supported and easily updatable standard library implementation, we use a specially-built libc++ implementation in our shipping products. This enables us to use a modern C++ standard library implementation while still targeting older rootfs targets with outdated C++ library implementations. + ### 2.11.1 Do not use wchar_t The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). Use the `WCHAR` alias instead, which is always 2 bytes. The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the [CoreCLR minipal](https://github.com/dotnet/runtime/tree/main/src/coreclr/minipal) and in the [repo minipal](https://github.com/dotnet/runtime/tree/main/src/native/minipal) should be used. In these minipals, the APIs may use `char16_t` or a locally-defined `CHAR16_T` type. In both cases, these types are compatible with the `WCHAR` alias in CoreCLR. If a minipal API exists, it should be used instead of the PAL API. 
From 090fcff3541e5c3d50eed92708661974df8c37a1 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Wed, 15 May 2024 14:04:44 -0700 Subject: [PATCH 4/7] Update docs/coding-guidelines/clr-code-guide.md --- docs/coding-guidelines/clr-code-guide.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index c30f05381944ee..6a573847f4d3bb 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -1262,7 +1262,7 @@ The C and C++ standard headers are available for usage in the CoreCLR code-base. Code that will only run in other processes, such as `createdump` or other extraneous tools, do not have the same set of restrictions. -To ensure we're using a supported and easily updatable standard library implementation, we use a specially-built libc++ implementation in our shipping products. This enables us to use a modern C++ standard library implementation while still targeting older rootfs targets with outdated C++ library implementations. +To ensure we're using a supported and easily updatable standard library implementation, we use a specially built libc++ implementation, in our Linux Docker containers, in our shipping products. This enables us to use a modern C++ standard library implementation while still targeting older rootfs targets with outdated C++ library implementations. 
### 2.11.1 Do not use wchar_t From 7228291d2507bc92772a5bc27cc37c1490c91436 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Fri, 17 May 2024 11:13:02 -0700 Subject: [PATCH 5/7] Update statement on exception rules --- docs/coding-guidelines/clr-code-guide.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index 6a573847f4d3bb..36174447f1a437 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -85,7 +85,7 @@ Written in 2006, by: * [2.10.6 For more details...](#2.10.6) * [2.11 Using standard headers](#2.11) * [2.11.1 Do not use wchar_t](#2.11.1) - * [2.11.2 Do not use C++ standard exceptions](#2.11.2) + * [2.11.2 Do not use C++ Standard-defined exceptions](#2.11.2) * [2.11.3 Do not use getenv on Unix platforms](#2.11.3) * [2.12 Is your code DAC compliant?](#2.12) @@ -1268,9 +1268,9 @@ To ensure we're using a supported and easily updatable standard library implemen The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). Use the `WCHAR` alias instead, which is always 2 bytes. The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the [CoreCLR minipal](https://github.com/dotnet/runtime/tree/main/src/coreclr/minipal) and in the [repo minipal](https://github.com/dotnet/runtime/tree/main/src/native/minipal) should be used. In these minipals, the APIs may use `char16_t` or a locally-defined `CHAR16_T` type. In both cases, these types are compatible with the `WCHAR` alias in CoreCLR. If a minipal API exists, it should be used instead of the PAL API. 
-### 2.11.2 Do not use C++ standard exceptions +### 2.11.2 Do not use C++ Standard-defined exceptions -The exception handling mechanisms in CoreCLR only handle `Exception`-derived types and `PAL_SEHException`. As a result, standard C++ exceptions, derived from `std::exception`, will cause runtime instability and should never be used. This includes using standard collection types with the default `std::allocator` allocator. +The exception handling mechanisms in CoreCLR only handle `Exception`-derived types and `PAL_SEHException`. As a result, standard C++ exceptions, derived from `std::exception`, will cause runtime instability and should never be used. There is one standard C++ exception type the CoreCLR infrastructure supports, `std::bad_alloc`. Since CoreCLR supports `std::bad_alloc`, the standard container allocators, `std::allocator` and the standard C++ containers, can be used as long as only the non-throwing members are used. ### 2.11.3 Do not use getenv on Unix platforms From f507ff6e5d02968f822f306a4bd403068e6f3d55 Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Fri, 17 May 2024 11:27:59 -0700 Subject: [PATCH 6/7] Provide an example of a throwing member and a mechanism to figure out what throws and what doesn't. --- docs/coding-guidelines/clr-code-guide.md | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index 36174447f1a437..9ff02d0cebac3a 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -1272,6 +1272,8 @@ The `wchar_t` type is implementation-defined, with Windows and Unix-based platfo The exception handling mechanisms in CoreCLR only handle `Exception`-derived types and `PAL_SEHException`. As a result, standard C++ exceptions, derived from `std::exception`, will cause runtime instability and should never be used. There is one standard C++ exception type the CoreCLR infrastructure supports, `std::bad_alloc`. 
Since CoreCLR supports `std::bad_alloc`, the standard container allocators, `std::allocator` and the standard C++ containers, can be used as long as only the non-throwing members are used. +For example, `std::vector::at()` should not be used as it may throw an `std::out_of_range` exception. Check the C++ standard or [cppreference.com](https://en.cppreference.com) for each member you plan to use to ensure that it will not throw a C++ standard exception other than `std::bad_alloc`. + ### 2.11.3 Do not use getenv on Unix platforms The POSIX API `setenv` is not thread safe with `getenv` and can lead to crashes. CoreCLR provides a `PAL_getenv` API that is thread-safe. This API should be used instead when on non-Windows platforms. From b0b4789d338b42bbcc558b5cf9d80f01c3f9644f Mon Sep 17 00:00:00 2001 From: Jeremy Koritzinsky Date: Thu, 8 Aug 2024 10:08:38 -0700 Subject: [PATCH 7/7] The libc++ experiment failed. Update the guide --- docs/coding-guidelines/clr-code-guide.md | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/docs/coding-guidelines/clr-code-guide.md b/docs/coding-guidelines/clr-code-guide.md index 9ff02d0cebac3a..3aee9554520bab 100644 --- a/docs/coding-guidelines/clr-code-guide.md +++ b/docs/coding-guidelines/clr-code-guide.md @@ -1262,8 +1262,6 @@ The C and C++ standard headers are available for usage in the CoreCLR code-base. Code that will only run in other processes, such as `createdump` or other extraneous tools, do not have the same set of restrictions. -To ensure we're using a supported and easily updatable standard library implementation, we use a specially built libc++ implementation, in our Linux Docker containers, in our shipping products. This enables us to use a modern C++ standard library implementation while still targeting older rootfs targets with outdated C++ library implementations. 
- ### 2.11.1 Do not use wchar_t The `wchar_t` type is implementation-defined, with Windows and Unix-based platforms using different definitions (2 byte vs 4 byte). Use the `WCHAR` alias instead, which is always 2 bytes. The CoreCLR PAL provides implementations of a variety of the C standard `wchar_t` APIs with the `WCHAR` type instead. These methods, as well as the methods in the [CoreCLR minipal](https://github.com/dotnet/runtime/tree/main/src/coreclr/minipal) and in the [repo minipal](https://github.com/dotnet/runtime/tree/main/src/native/minipal) should be used. In these minipals, the APIs may use `char16_t` or a locally-defined `CHAR16_T` type. In both cases, these types are compatible with the `WCHAR` alias in CoreCLR. If a minipal API exists, it should be used instead of the PAL API. @@ -1278,6 +1276,14 @@ For example, `std::vector::at()` should not be used as it may throw an `std:: The POSIX API `setenv` is not thread safe with `getenv` and can lead to crashes. CoreCLR provides a `PAL_getenv` API that is thread-safe. This API should be used instead when on non-Windows platforms. +### 2.11.4 Limit usage of standard template types in shipping executables + +For Linux x64 and amd64 platforms, we build against a very old libstdc++, the version that shipped with Ubuntu 16.04. As a result, we strive to reduce our usage of template types (where code from the headers will be inserted into our binaries) in shipping executables and libraries. + +This rule applies to `coreclr` as well as to shipping external executables like `createdump`. + +For non-shipping native code, like the `superpmi` tools suite, standard headers can be used without limitation. + ## 2.12 Is your code DAC compliant? At a high level, DAC is a technique to enable execution of CLR algorithms from out-of-process (eg. on a memory dump). Core CLR code is compiled in a special mode (with DACCESS_COMPILE defined) where all pointer dereferences are intercepted.
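
As an illustration of the rule in 2.11.2 (use only the non-throwing members of standard containers), the following is a minimal sketch of element access that avoids `std::vector::at()`. The helper name `TryGetElement` is hypothetical and is not a CoreCLR API; it only assumes that `size()` and `operator[]` do not throw, which matches their description on cppreference.com.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only (not part of CoreCLR).
// Returns a pointer to the element at `index`, or nullptr when the index is
// out of range. It relies on size() and operator[], which do not throw,
// rather than at(), which may throw std::out_of_range.
template <typename T>
const T* TryGetElement(const std::vector<T>& values, std::size_t index) noexcept
{
    if (index >= values.size())
    {
        return nullptr; // the caller handles the failure without an exception
    }
    return &values[index];
}
```

A caller would check the result for `nullptr` before dereferencing it. Note that populating the vector may still throw `std::bad_alloc`, which is the one standard exception type the guide says CoreCLR supports.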