diff --git a/docs/_static/custom.css b/docs/_static/custom.css index 88bb9413f29..d020a9af026 100644 --- a/docs/_static/custom.css +++ b/docs/_static/custom.css @@ -67,3 +67,9 @@ pre.productionlist code.literal { .tab-set { margin-bottom: 1.5rem; } + +@media not print, (prefers-color-scheme: dark) { + body:not([data-theme="light"]) .highlight .hll { + background-color: rgba(235, 218, 177, 0.10); + } +} diff --git a/docs/source-2.0/glossary.rst b/docs/source-2.0/glossary.rst new file mode 100644 index 00000000000..dce54b1fffa --- /dev/null +++ b/docs/source-2.0/glossary.rst @@ -0,0 +1,152 @@ +-------- +Glossary +-------- + +.. index:: foo + +.. glossary:: + + AbstractCodeWriter + A Java class in the :term:`Smithy reference implementation` used to + generate code for a :term:`target environment`. Find the + `source code on GitHub `__. + + codegen + Code generation + Code generators output source code to represent the shapes defined + in a :term:`Smithy model`, and the code they generate uses both the + standard library and the :term:`runtime libraries` for the + :term:`target environment`. + + codegen-core + A `set of Java libraries `__ + built on top of the :term:`Smithy reference implementation` that are + used to implement Smithy code generators. codegen-core contains + libraries for writing code, managing dependencies, managing imports, + converting Smithy shapes to :term:`symbols` of a :term:`target environment`, + :term:`reserved words` handling, etc. + + Gradle + `Gradle `__ is a build tool for Java, Kotlin, and + other languages. Gradle is typically the build system used to develop + Smithy code generators. The Smithy team + `maintains a Gradle plugin `__ for + running Smithy-Build via Gradle. + + Integrations + Integrations are code generator plugins. Integrations are defined by + each code generator. 
They can be used to preprocess the model, modify + the SymbolProvider used during code generation, add dependencies for + the target environment, generate additional files, register protocol + code generators, add configuration options to clients, etc. + + .. seealso:: :doc:`guides/codegen/making-codegen-pluggable` + + Java Service Provider Interface + SPI + A feature for loading and discovering implementations of a Java interface. + SPI is used throughout the :term:`Smithy reference implementation` as + a plugin system. See the `Java documentation `__ + for more information. + + Knowledge index + Abstractions provided in the :term:`Smithy reference implementation` + that extract information from the metamodel in a more accessible way. + For example, the `HttpBindingIndex`_ makes it easier to codegen HTTP + bindings, and the `NullableIndex`_ hides the details of determining + if a member is optional or always present. + + Projection + A specific view of a Smithy model that has added, removed, or + transformed model components. + + .. seealso:: :doc:`guides/building-models/build-config` + + Reserved words + Identifiers and words that cannot be used in a + :term:`target environment`. Reserved words can be contextual or global + to the target language (for example, a word might only be reserved when + used as a structure property but not when used as the name of a + shape). Code generators are expected to automatically translate + reserved words into an identifier that is safe to use in the + target environment. + + Runtime libraries + The libraries used at runtime in a :term:`target environment`. + For example, HTTP clients, type implementations like + big decimal, etc. + + Semantic model + The Smithy semantic model is an in-memory representation of the shapes, + traits, and metadata defined in the Smithy model. In Smithy's + reference implementation, the semantic model is contained in the + `Model class`_. + + Serde + Shortened version of serialization and deserialization. 
+ + Service closure + The shapes connected to a service. These shapes are code generated. + + Shapes + Shapes are named declarations of Smithy types that make up the + :term:`semantic model`. + + Smithy-Build + A model transformation framework built on top of the + :term:`Smithy reference implementation`. Code generators are + implemented as :ref:`smithy-build ` plugins. + + smithy-build.json + The file used to configure Smithy-Build. Code generators are configured + and executed by adding plugins to smithy-build.json files in various + projections. + + .. seealso:: :doc:`guides/building-models/build-config` + + Smithy model + Smithy models define services, operations, resources, and shapes. + Smithy models are made up of one or more files to form the + semantic model. Model files can use a JSON or IDL representation. + + Smithy reference implementation + The Java implementation of Smithy that is used to load, validate, + transform, and extract information from Smithy models. + + Smithy type + The types of shapes that can be defined in a Smithy model (for example, + string, integer, structure, etc.). + + Symbol + Symbols + The qualified name of a type in a target programming language. Symbols + are used to map Smithy shapes to types in a :term:`target environment`, + refer to language types, and refer to libraries that might be needed by + the generated code. A symbol contains an optional namespace, optional + namespace delimiter, name, a map of additional properties, a + declaration file that determines where the symbol is declared, and a + definition file that determines where a symbol is defined. Symbols can + also contain *SymbolDependencies* that are used to automatically manage + imports in a CodeWriter and to generate dependency closures for the + target environment. + + SymbolProvider + A SymbolProvider is used to generate Symbols for Smithy shapes and + members. 
SymbolProviders can be decorated to provide additional + functionality like automatically renaming reserved words. + + Target environment + The intended programming language and specific environment of a code + generator. For example, TypeScript running in the browser is a target + environment. + + Traits + Traits are model components that can be attached to :ref:`shapes ` + to describe additional information about the shape; shapes provide + the structure and layout of an API, while traits provide refinement + and style. Code generators use traits to influence generated code. + + +.. _HttpBindingIndex: https://github.com/awslabs/smithy/blob/main/smithy-model/src/main/java/software/amazon/smithy/model/knowledge/HttpBindingIndex.java +.. _NullableIndex: https://github.com/awslabs/smithy/blob/main/smithy-model/src/main/java/software/amazon/smithy/model/knowledge/NullableIndex.java +.. _Model class: https://github.com/awslabs/smithy/blob/main/smithy-model/src/main/java/software/amazon/smithy/model/Model.java diff --git a/docs/source-2.0/guides/building-models/build-config.rst b/docs/source-2.0/guides/building-models/build-config.rst index 58268539e1b..3fc0e690cca 100644 --- a/docs/source-2.0/guides/building-models/build-config.rst +++ b/docs/source-2.0/guides/building-models/build-config.rst @@ -1,3 +1,5 @@ +.. _smithy-build: + ============ smithy-build ============ diff --git a/docs/source-2.0/guides/codegen/configuring-the-generator.rst b/docs/source-2.0/guides/codegen/configuring-the-generator.rst new file mode 100644 index 00000000000..02da854f227 --- /dev/null +++ b/docs/source-2.0/guides/codegen/configuring-the-generator.rst @@ -0,0 +1,256 @@ +------------------------- +Configuring the Generator +------------------------- + +This document provides guidance on how to configure a code generator. + + +Introduction +============ + +Smithy code generators are configured using plugins defined in +smithy-build.json files. For example: + +.. 
code-block:: json + :emphasize-lines: 4-9 + + { + "version": "1.0", + "plugins": { + "foo-client-codegen": { + "service": "smithy.example#Weather", + "package": "com.example.weather", + "edition": "2023" + } + } + } + + +How to name codegen plugins +=========================== + +Smithy code generation plugins should use a naming pattern of +``--codegen``, where: + +* ```` is the name of the programming language +* ```` is one of "client", "server", or "types" +* ``codegen`` is the kind of plugin (in this case, a code generator) + +Examples: + +* ``foo-service-codegen``: generate a hypothetical Foo language service +* ``foo-client-codegen``: generate a hypothetical Foo language client + + +Recommended properties +====================== + +Every codegen plugin should support the following properties, and may +choose to introduce any other properties as needed. + + +``service`` +----------- + +The ``service`` property defines the service shape ID to generate (for +example, ``"smithy.example#Weather"``). Smithy models can contain +multiple services. Providing a ``service`` shape ID tells code +generators which service to generate. + +Generators may choose to make ``service`` optional. If optional, the +generator will attempt to find every service in the model. If only a +single service is found in the model, it is used for code generation. If +multiple services are found, the generator should fail and require an +explicit service shape ID. + + +``protocol`` (client and type codegen only) +------------------------------------------- + +Defines the optional Smithy protocol to use for the generated client. +This value is provided as a shape ID that refers to the protocol trait +(for example, ``"aws.protocols#restJson1"``). This protocol shape ID +must be present in the model and applied as a trait to the resolved +``service`` shape to generate. 
If no protocol is provided, the generator +should find all the protocol traits attached to the resolved ``service`` +and choose which protocol to use for code generation. It is up to the +generator to prioritize and choose protocols. + +.. note:: + + The ``protocol`` setting is typically only used by clients because + a service is expected to support every protocol of the service, while + a client can choose to connect over a single protocol. + + +``edition`` +----------- + +The ``edition`` property configures the code generator to use the best +practices defined as of a specific date (for example, ``2023``). +Editions should automatically enable and disable other feature-gates in +a generator. For example, if the TypeScript code generator decided that +there needs to be a new way to generate unions, then they could continue +to support the existing union behavior, add a feature gate to generate +the new union code, and eventually add a new edition that enables this +feature by default. + +Editions in Smithy code generators are basically the same thing as +`editions in +Rust `__. +They configure the Smithy code generator to take on new default behavior +as use cases evolve, features are added to the target language, or we +learn from customer feedback that we didn't get an abstraction right. + +It is highly recommended that you make ``edition`` **required** to force +end users to opt-in to an edition rather than use a default edition. +Avoiding a default in this case makes it much more likely that new and +improved code generation features will be used by new users rather than +them naively sticking with an outdated edition simply because it's the +default. + + +``relativeDate`` (client and type codegen only) +----------------------------------------------- + +Causes code generation to omit shapes that were deprecated prior to the +given ISO 8601 date (``YYYY-MM-DD``). 
+ +While other relativization transforms can be added in the future, +setting ``relativeDate`` causes shapes marked with the :ref:`deprecated-trait` +that have a "since" version that lexicographically comes before the provided +value to be omitted from the generated code. If the shape uses a ``since`` +value that does not follow the ``YYYY-MM-DD`` format, then the shape is +included regardless of the deprecated trait. + +For example, consider the following model: + +.. code-block:: smithy + + service Foo { + operations: [PutA, CreateA] + } + + @deprecated(since: "2019-06-11") + operation PutA { + input:= {} + output:= {} + } + + operation CreateA { + input:= {} + output:= {} + } + +If ``relativeDate`` is set to ``2023-04-15``, then the ``PutA`` +operation, its inputs, and outputs are omitted from codegen because the +``since`` value of the trait comes before the provided date. + + +``relativeVersion`` (client and type codegen only) +-------------------------------------------------- + +This setting provides the same behavior as ``relativeDate``, but uses +`Semantic Versioning `__ rather than a date-based +versioning strategy. The provided string value is parsed into a SemVer +representation and compared against the ``since`` property of shapes +marked as ``@deprecated``. If the ``@deprecated`` trait uses a ``since`` +value that is not a valid SemVer string, then the shape is included. + +For example, consider the following model: + +.. code-block:: smithy + + service Foo { + operations: [PutA, CreateA] + } + + @deprecated(since: "2.4") + operation PutA { + input:= {} + output:= {} + } + + operation CreateA { + input:= {} + output:= {} + } + +If ``relativeVersion`` is set to ``3.0``, then the ``PutA`` operation is +omitted from codegen because the ``since`` value of the trait is an +earlier version than the provided version. + +.. note:: + + ``relativeVersion`` and ``relativeDate`` can be used in tandem. 
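The comparisons described for ``relativeDate`` and ``relativeVersion`` can be sketched in plain Java. This is only an illustration under stated assumptions, not Smithy's implementation: the ``RelativeDeprecation`` class and its method names are hypothetical, and the version parser is deliberately lenient (one to three numeric components, no pre-release or build metadata) so that values like ``2.4`` and ``3.0`` from the example above can be compared.

```java
import java.util.regex.Pattern;

/**
 * Hypothetical helper (not part of Smithy) that sketches how a generator
 * might decide whether a deprecated shape should be omitted.
 */
public final class RelativeDeprecation {
    // Only "since" values in YYYY-MM-DD form take part in the date comparison.
    private static final Pattern DATE = Pattern.compile("\\d{4}-\\d{2}-\\d{2}");

    // Assumption: accept one to three numeric components so "2.4" parses.
    private static final Pattern VERSION = Pattern.compile("\\d{1,9}(\\.\\d{1,9}){0,2}");

    private RelativeDeprecation() {}

    /** Returns true if a shape deprecated "since" a date should be omitted. */
    public static boolean omitByDate(String since, String relativeDate) {
        if (since == null || !DATE.matcher(since).matches()) {
            return false; // Unrecognized format: the shape is kept.
        }
        // YYYY-MM-DD strings sort chronologically when compared lexicographically.
        return since.compareTo(relativeDate) < 0;
    }

    /** Returns true if a shape deprecated "since" a version should be omitted. */
    public static boolean omitByVersion(String since, String relativeVersion) {
        int[] sinceParts = parseVersion(since);
        int[] relativeParts = parseVersion(relativeVersion);
        if (sinceParts == null || relativeParts == null) {
            return false; // Unparseable version: the shape is kept.
        }
        for (int i = 0; i < 3; i++) {
            if (sinceParts[i] != relativeParts[i]) {
                return sinceParts[i] < relativeParts[i];
            }
        }
        return false; // Equal versions are not earlier, so the shape is kept.
    }

    // Parses "major[.minor[.patch]]"; missing components default to zero.
    private static int[] parseVersion(String value) {
        if (value == null || !VERSION.matcher(value).matches()) {
            return null;
        }
        String[] parts = value.split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(omitByDate("2019-06-11", "2023-04-15"));
        System.out.println(omitByVersion("2.4", "3.0"));
    }
}
```

A generator that supports both settings could, for instance, omit a shape when either check returns true; how the two settings combine is left to the generator and is not specified here.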
+ + +Converting JSON configuration to Java +===================================== + +Configuration settings are parsed into generic "node" objects that +Smithy-Build plugins can then deserialize into strongly typed `Java +records `__ +or POJOs. For example: + +.. code-block:: java + + public final class FooCodegenSettings { + private ShapeId service; + private String packageName; + private String edition; + + public ShapeId getService() { + return service; + } + + public void setService(ShapeId service) { + this.service = service; + } + + public String getPackage() { + return packageName; + } + + public void setPackage(String packageName) { + this.packageName = packageName; + } + + public String getEdition() { + return edition; + } + + public void setEdition(String edition) { + this.edition = edition; + } + } + +You can use :ref:`directedcodegen` to +easily wire up the POJO to your generator. Wiring up the configuration +provided to the plugin to the generator can be done in +``SmithyBuildPlugin#execute`` using ``CodegenDirector#settings``. + +.. code-block:: java + :emphasize-lines: 12 + + public final class FooCodegenPlugin implements SmithyBuildPlugin { + @Override + public String getName() { + return "foo-client-codegen"; + } + + @Override + public void execute(PluginContext context) { + CodegenDirector + runner = new CodegenDirector<>(); + runner.directedCodegen(new DirectedFooCodegen()); + runner.settings(FooCodegenSettings.class, context.getSettings()); + // ... + runner.run(); + } + } + +.. 
seealso:: + + * :ref:`codegen-creating-smithy-build-plugin` + * :ref:`running-directedcodegen` diff --git a/docs/source-2.0/guides/codegen/creating-codegen-repo.rst b/docs/source-2.0/guides/codegen/creating-codegen-repo.rst new file mode 100644 index 00000000000..91dd1698e28 --- /dev/null +++ b/docs/source-2.0/guides/codegen/creating-codegen-repo.rst @@ -0,0 +1,274 @@ +----------------------- +Creating a Codegen Repo +----------------------- + +You'll want to create a repository for a Smithy code generator. Most Smithy +generators use Git repos hosted on GitHub. Smithy codegen repos are usually +titled "smithy-" where "" is the target programming language. These repos +contain: + +1. Generic Smithy code generation, typically written in Java. +2. Runtime libraries used by the code generator. +3. Gradle build tooling to publish the code generator to places like + `Maven Central`_. This is important as it allows others to use your code + generator in their own projects. +4. Language-specific build tooling to build and publish the Smithy + runtime libraries to language-specific artifact repositories (e.g., + Maven Central, NPM, RubyGems, crates.io, etc.). + + +Example codegen repositories +============================ + +Here are a few example Smithy codegen repos created by AWS: + +- https://github.com/awslabs/smithy-typescript +- https://github.com/aws/smithy-go +- https://github.com/awslabs/smithy-rs +- https://github.com/awslabs/smithy-ruby +- https://github.com/awslabs/smithy-kotlin +- https://github.com/awslabs/smithy-swift +- https://github.com/awslabs/smithy-python + + +Codegen repo layout +=================== + +The root of a Smithy codegen repo should look and appear like a +typical repository for the target programming language. Code generation +should be isolated to a subdirectory named ``codegen`` that contains a +`multi-module Gradle package`_. 
A multi-module layout allows you to create +the code generator and an example package used to integration test the +generator. Java based repos will have a layout similar to the following: + +.. code-block:: none + + . + ├── CHANGES.md + ├── CODE_OF_CONDUCT.md + ├── CONTRIBUTING.md + ├── LICENSE + ├── NOTICE + ├── README.md + ├── codegen + │ ├── README.md + │ ├── build.gradle.kts + │ ├── config + │ │ ├── checkstyle + │ │ │ ├── checkstyle.xml + │ │ │ └── suppressions.xml + │ │ └── spotbugs + │ │ └── filter.xml + │ ├── gradle + │ │ └── wrapper + │ │ ├── gradle-wrapper.jar + │ │ └── gradle-wrapper.properties + │ ├── gradle.properties + │ ├── gradlew + │ ├── gradlew.bat + │ ├── settings.gradle.kts + │ ├── smithy-mylang-codegen + │ │ ├── build.gradle.kts + │ │ └── src + │ │ ├── main + │ │ │ ├── java + │ │ │ │ └── software + │ │ │ │ └── amazon + │ │ │ │ └── smithy + │ │ │ │ └── mylang + │ │ │ │ └── codegen + │ │ │ │ └── MylangCodegenPlugin.java + │ │ │ └── resources + │ │ │ └── META-INF + │ │ │ └── services + │ │ │ └── software.amazon.smithy.build.SmithyBuildPlugin + │ │ └── test + │ │ ├── java + │ │ │ └── software + │ │ │ └── amazon + │ │ │ └── smithy + │ │ │ └── mylang + │ │ │ └── codegen + │ │ │ └── MylangCodegenPluginTest.java + │ │ └── resources + │ │ └── software + │ │ └── amazon + │ │ └── smithy + │ │ └── mylang + │ │ └── codegen + │ └── smithy-mylang-codegen-test + │ ├── build.gradle.kts + │ ├── model + │ │ ├── main.smithy + │ └── smithy-build.json + └── designs + + +Directory descriptions +---------------------- + +- ``codegen/``: All Smithy codegen functionality should appear in a + sub-directory. +- ``codegen/smithy-mylang-codegen/``: Where the code generator is + implemented in Java. Rename "mylang" to your generator's name. This + project should eventually be published to Maven Central. +- ``codegen/smithy-mylang-codegen-test/``: A test project used to + exercise the code generator. This project should not be published to + Maven Central. 
+- ``designs/``: Public design documents. It's useful to publish design + documents for the repo so consumers of the repo know how Smithy is + mapped to the target environment and what tradeoffs were made in the + implementation. + + +.. _codegen-creating-smithy-build-plugin: + +Creating a Smithy-Build plugin +============================== + +The entry point to any Smithy code generator is a Smithy-Build plugin +implementation of ``software.amazon.smithy.build.SmithyBuildPlugin``. +This plugin is discovered on the classpath and tells Smithy-Build what +plugin name it implements. For example, the simplest plugin looks +something like this: + +.. code-block:: java + + package software.amazon.smithy.mylang.codegen; + + import software.amazon.smithy.build.PluginContext; + import software.amazon.smithy.build.SmithyBuildPlugin; + + /** + * Plugin to perform Mylang code generation. + */ + public final class MylangCodegenPlugin implements SmithyBuildPlugin { + @Override + public String getName() { + // Tell Smithy-Build which plugin this is. + return "mylang-codegen"; + } + + @Override + public void execute(PluginContext context) { + // Create and run the generator using the provided context. + new MylangCodeGenerator(context).run(); + } + } + +Java is made aware of the plugin by adding the name of the plugin class +into a special META-INF file in: + +.. code-block:: none + + codegen/smithy-mylang-codegen/src/main/resources/META-INF/services/software.amazon.smithy.build.SmithyBuildPlugin + +The file will contain a line that contains the full Java class name of +the plugin: + +.. code-block:: none + + software.amazon.smithy.mylang.codegen.MylangCodegenPlugin + +The next step is to implement the code generator. + + +Using Gradle +============ + +Smithy codegen projects typically use Gradle as a build tool for +compiling JARs, running JUnit tests, running Checkstyle, running +SpotBugs, and publishing JARs to Maven Central. 
+ + +Running unit tests +------------------ + +Gradle by default looks for JUnit tests in +``codegen/smithy-mylang-codegen/src/test/java``. Tests are run using the +following command: + +.. code-block:: none + + ./gradlew :smithy-mylang-codegen:test + +(where ``:smithy-mylang-codegen`` is the module name to test and +``test`` is the target action to run). + + +Using Gradle with local packages +-------------------------------- + +When developing a Smithy code generator, you'll often need to work with +unreleased changes of the Smithy repo in other repos like an AWS SDK +code generator. If you use the Smithy codegen template repository, it +will automatically use whatever it finds in Maven Local, a local Maven +repository on your computer, rather than something like Maven Central, a +remote repository. You can add packages to Maven local using Gradle: + +.. code-block:: none + + ./gradlew :smithy-mylang-codegen:pTML + +If you need to use unreleased changes to +`awslabs/smithy `__, then clone the +repository and run: + +.. code-block:: none + + ./gradlew pTML + + +FAQ +=== + +Do I have to use Gradle? +------------------------ + +No, you can use any build tool you'd like. All the Smithy codegen +implementations built by AWS as of January 2023 use Gradle to build their +generators, so it is likely the path of least resistance. Gradle has +plenty of usability issues, but it can do basically anything you'll +need, including publishing your generator to Maven Central. If you use +something other than Gradle, you might have extra work to do to create a +test project that generates code from a Smithy model. + + +Can I use Kotlin to do codegen? +------------------------------- + +You can use any language you want to build a Smithy generator. If you're +building a Smithy code generator for an officially supported AWS SDK, +you are strongly encouraged to understand the business implications of +using Kotlin. 
Smithy's reference implementation is written in Java, +which a Kotlin code generator would use. However, building a Smithy code +generator in Java requires a team to learn and use Java. Using Kotlin +requires the team to learn Java *and* Kotlin. + + +I'm also building an AWS SDK. Where should that code go? +-------------------------------------------------------- + +There are various approaches you can take. The typical approach is to +have one GitHub repo dedicated to Smithy code generation and another +dedicated to the AWS SDK. Smithy is not AWS-specific and must be able +to generate code for teams outside of Amazon. + +For branding and discoverability, official AWS SDKs should all be +available in GitHub repos dedicated to that SDK. This repository should +have a ``codegen`` module in a sub-directory that depends on and extends +the generic Smithy code generator for the language. + + +When should I publish codegen packages to Maven Central? +-------------------------------------------------------- + +Publish codegen packages to Maven Central just like any other software +project — when there are changes you want your consumers to use, +including the AWS SDK. AWS SDK code generators should also be published +to Maven Central to allow developers to generate code that uses +AWS signature version 4 or any AWS protocols. + + +.. _Maven Central: https://search.maven.org +.. 
_multi-module Gradle package: https://docs.gradle.org/current/userguide/multi_project_builds.html#multi_project_builds diff --git a/docs/source-2.0/guides/codegen/decoupling-codegen-with-symbols.rst b/docs/source-2.0/guides/codegen/decoupling-codegen-with-symbols.rst new file mode 100644 index 00000000000..4a6d58aadb5 --- /dev/null +++ b/docs/source-2.0/guides/codegen/decoupling-codegen-with-symbols.rst @@ -0,0 +1,762 @@ +------------------------------- +Decoupling Codegen with Symbols +------------------------------- + +:term:`Symbols` are used in Smithy code generators to refer to qualified types +in the target programming language. Symbols provide a layer of abstraction +between the code being generated and the logic used to determine: + +* how shapes are named +* where types are declared and defined +* the runtime dependencies needed for a type +* the imports needed to define and reference a type + + +Quick Symbol example +==================== + +The following example uses the built-in "``T``" formatter of +`SymbolWriter `_ +to write symbols to the generated code and automatically add imports to +the file: + +.. code-block:: java + + // Create MyWriter, an imaginary subclass of SymbolWriter. + // Set the namespace of the writer to "example.foo", which internally + // calls SymbolWriter#relativizeSymbols, passing in "example.foo". + var writer = new MyWriter("example.foo"); + + // Create an example Symbol that refers to a type in the same namespace + // as the writer's current namespace. Setting the namespace also + // requires the namespace separator. + var exampleSymbol = Symbol.builder() + .name("Example") + .namespace("example.foo", ".") + .build(); + + // "$T" in the following call to 'write' is replaced with just + // the name of the Symbol ("Example") because the Symbol's + // namespace matches the relative namespace of the writer. 
+ // If exampleSymbol was in a different namespace, the writer + // would either write the fully-qualified name or would add + // an import to the file for the symbol. + writer.write(""" + if ($T.isEmpty()) { + doSomething(); + } + """, + exampleSymbol); + + +Benefits of Symbols +=================== + +- Refactoring. Symbols make it easy to refactor how shapes are generated + and the file location of where shapes are generated. +- Managing imports. Symbols can contain the list of other symbols + needed in order to refer to the type. For example, if a list has + generic type parameters (e.g., ``List``), then the symbol + that refers to the list would also contain symbols referring to the + generic type parameters. All the symbols needed in a generated code + file can then be used to automatically create import statements as + symbols are referenced by a CodeWriter. +- Managing dependencies. Symbols can define the dependencies needed by + the generated code in order to reference the symbol. For example, if + BigDecimal implementations are provided by a third-party library, + then a dependency is needed by the generated code in order to + reference BigDecimal in code. Symbols carry this information because + they are implementations of a ``SymbolDependencyContainer``. After + writing code to a CodeWriter, all the dependencies of each referenced + symbol can be used to generate a dependency graph (e.g., Maven poms, + Python setup.py files, etc). +- Readability. Referencing symbols using ``$T`` makes the string + templates passed to CodeWriter more succinct. + + +Referencing Symbols using SymbolReference +========================================= + +A `SymbolReference`_ is used to refer to another Symbol. For example, a +SymbolReference is used when a Symbol uses generic type parameters. The +following example is a Symbol created for a list of +``example.foo.Example`` values in a Java-like language, +``collections.List``: + +.. 
code-block:: java + + // Create the "Example" Symbol used in the "List". + var exampleSymbol = Symbol.builder() + .name("Example") + .namespace("example.foo", ".") + .build(); + + // Create a "List" that can be properly relativized + // against the current namespace of the SymbolWriter. + var listSymbol = Symbol.builder() + .name("List") + .namespace("collections", ".") + // Automatically creates a SymbolReference from a Symbol. + .addReference(exampleSymbol) + .build(); + + +Aliasing Symbols +---------------- + +An alias can be given to SymbolReference to deconflict the symbol with +other symbols already in the target namespace. For example, let's say +you need to reference a type that uses a fairly common name, so you +decide to alias the common name to something that is far less likely to +have conflicts. + +The following example creates a Symbol for ``List<__Example>`` where +``__Example`` is an alias to ``example.foo.Example``: + +.. code-block:: java + + var exampleSymbol = Symbol.builder() + .name("Example") + .namespace("example.foo", ".") + .build(); + + // Alias "Example" to "__Example". + SymbolReference exampleReference = exampleSymbol.toReference("__Example"); + + // Create a "List<__Example>" that can be properly relativized + // against the current namespace of the SymbolWriter. + var listSymbol = Symbol.builder() + .name("List") + .namespace("collections", ".") + .addReference(exampleReference) + .build(); + +When a SymbolReference is added to a Symbol, ``SymbolWriter`` will know +that the references of the Symbol need to be accounted for when writing +the symbol by importing any necessary dependencies with appropriate +aliases. + +.. code-block:: java + + // Hypothetical example of managing imports and using references. 
+ var writer = new MyWriter("example.other.namespace"); + + writer.write(""" + var list = new $T(); + """, + listSymbol); + + assert(writer.toString().equals(""" + package example.other.namespace; + + import collections.List; + import example.foo.Example as __Example; + + var list = new List<__Example>(); + """)); + + +Symbol dependencies +=================== + +Symbols can be used to automatically generate dependency closures and +configuration files based on the symbols written to a ``SymbolWriter``. +This allows generated code to depend on only the closure of +dependencies it actually needs. Codegen plugins can conditionally +require runtime dependencies in generated code (something needed by AWS +SDKs to add dependencies on AWS signature version 4 implementations, +credential providers, etc). + +The symbols used during codegen can be tracked using a ``WriterDelegator``, +and from these tracked symbols, the graph of referenced ``SymbolDependency`` +can be written to whatever dependency manifest format is needed for the +target environment. + +Dependencies are registered with a Symbol by creating a +`SymbolDependency `_ +and adding them to the Symbol via ``Symbol#addDependency``. + +The following example creates a TypeScript Symbol for big decimal that +refers to a type defined in a package named ``big``: + +.. code-block:: java + + // This dependency is needed in JavaScript. + var bigRuntimeDependency = SymbolDependency.builder() + .dependencyType("dependencies") + .packageName("big") + .version("^5.2.2") + .build(); + + // This dependency is needed by the TypeScript compiler. + var bigTsDependency = SymbolDependency.builder() + .dependencyType("devDependencies") + .packageName("@types/big.js") + .version("^4.0.5") + .build(); + + // Create a symbol used for big decimals in Smithy. 
+ var big = Symbol.builder() + .name("Big") + .namespace("big", "/") + .addDependency(bigRuntimeDependency) + .addDependency(bigTsDependency) + .build(); + +As you can see in the above example, symbol dependencies can have a +dependency type that is used to classify when the dependency is needed +(see ``SymbolDependency#dependencyType``). It is common in TypeScript +libraries to need different dependencies for JavaScript code vs +TypeScript type definitions, so two dependencies were added to the +created ``big`` symbol: one that is a normal "dependencies" and one that +is a "``devDependencies``". + + +``SymbolDependency`` best practices +----------------------------------- + +Creating a ``SymbolDependency`` in each place the dependency is needed +spreads them out all over a project, making it difficult to change +dependencies. Rather than create a ``SymbolDependency`` each time they +are needed in the code generator, a better practice is to create a +dedicated Java ``enum`` that contains each Symbol used in the project. +This enum can be referenced throughout a project, making it possible to +update version numbers in a single place. + +For example: + +.. code-block:: java + + /** + * An enum of all of the built-in dependencies managed by this package. + */ + public enum TypeScriptDependency implements SymbolDependencyContainer { + + // Conditionally added if a big decimal shape is found in a model. 
+ BIG_JS("dependencies", "big.js", "^5.2.2"), + TYPES_BIG_JS("devDependencies", "@types/big.js", "^4.0.5"); + + public static final String NORMAL_DEPENDENCY = "dependencies"; + public static final String DEV_DEPENDENCY = "devDependencies"; + public static final String PEER_DEPENDENCY = "peerDependencies"; + public static final String BUNDLED_DEPENDENCY = "bundledDependencies"; + public static final String OPTIONAL_DEPENDENCY = "optionalDependencies"; + + public final SymbolDependency dependency; + + TypeScriptDependency(String type, String name, String version) { + this.dependency = SymbolDependency.builder() + .dependencyType(type) + .packageName(name) + .version(version) + .build(); + } + + @Override + public List<SymbolDependency> getDependencies() { + return Collections.singletonList(dependency); + } + } + +.. note:: + + 1. The ``enum`` implements `SymbolDependencyContainer`_, an abstraction + for composing dependencies. + 2. This example is taken from + `smithy-typescript `__, + which shows other possibilities like how to define unconditional + dependencies that are needed by every client. + + +Tracking externally controlled dependencies +------------------------------------------- + +A `DependencyTracker`_ can be used to track available dependencies using a +JSON file that can then be used to provide version numbers to an enum. This +can be useful if version numbers are maintained outside a code generator or +need to be translated from other formats or lock files. For example: + +.. code-block:: java + + /** + * An enum of all of the built-in dependencies managed by this package. + */ + public enum TypeScriptDependency implements SymbolDependencyContainer { + + // Conditionally added if a big decimal shape is found in a model. 
+ BIG_JS("big.js"), + TYPES_BIG_JS("@types/big.js"); + + public final SymbolDependency dependency; + + TypeScriptDependency(String name) { + this.dependency = VersionFile.VERSIONS.getByName(name); + } + + @Override + public List<SymbolDependency> getDependencies() { + return Collections.singletonList(dependency); + } + + private static final class VersionFile { + private static final DependencyTracker VERSIONS = new DependencyTracker(); + static { + String path = "sdkVersions.json"; + VERSIONS.addDependenciesFromJson(TypeScriptDependency.class.getResource(path)); + } + } + } + + +Converting shapes to Symbols with ``SymbolProviders`` +===================================================== + +A ``SymbolProvider`` is used to convert Smithy shapes to Symbols. A +``SymbolProvider`` is the brains of a Smithy code generator; it tells +the code generator the types used to represent shapes in the model, the +dependencies needed by generated code, the filenames used to declare and +define types, and automatically ensures reserved words in the target +language are accounted for during codegen. + +A selection of existing ``SymbolProviders`` can be found at: + +1. TypeScript: + https://github.com/awslabs/smithy-typescript/blob/main/smithy-typescript-codegen/src/main/java/software/amazon/smithy/typescript/codegen/SymbolVisitor.java +2. Python: + https://github.com/awslabs/smithy-python/blob/develop/codegen/smithy-python-codegen/src/main/java/software/amazon/smithy/python/codegen/SymbolVisitor.java +3. Go: + https://github.com/aws/smithy-go/blob/main/codegen/smithy-go-codegen/src/main/java/software/amazon/smithy/go/codegen/SymbolVisitor.java + +The simplest way to implement a ``SymbolProvider`` is to also implement +``ShapeVisitor``. The basic setup will look something like this: + +.. 
code-block:: java + + package software.amazon.smithy.python.codegen; + + import static java.lang.String.format; + + import java.util.logging.Logger; + import software.amazon.smithy.codegen.core.Symbol; + import software.amazon.smithy.codegen.core.SymbolProvider; + import software.amazon.smithy.model.Model; + import software.amazon.smithy.model.shapes.ServiceShape; + import software.amazon.smithy.model.shapes.Shape; + import software.amazon.smithy.model.shapes.ShapeVisitor; + import software.amazon.smithy.model.shapes.StructureShape; + import software.amazon.smithy.model.traits.ErrorTrait; + import software.amazon.smithy.utils.StringUtils; + + final class SymbolVisitor implements SymbolProvider, ShapeVisitor<Symbol> { + + private static final Logger LOGGER = Logger.getLogger(SymbolVisitor.class.getName()); + + private final Model model; + private final MySettings settings; + private final ServiceShape service; + + SymbolVisitor(Model model, MySettings settings) { + this.model = model; + this.settings = settings; + this.service = model.expectShape(settings.getService(), ServiceShape.class); + } + + @Override + public Symbol toSymbol(Shape shape) { + Symbol symbol = shape.accept(this); + LOGGER.fine(() -> format("Creating symbol from %s: %s", shape, symbol)); + // TODO: Escape reserved words. + return symbol; + } + + @Override + public Symbol structureShape(StructureShape shape) { + // Generate errors differently than normal structures. + if (shape.hasTrait(ErrorTrait.class)) { + return createErrorStructure(shape); + } else { + return createNormalStructure(shape); + } + } + + private Symbol createErrorStructure(StructureShape shape) { + throw new UnsupportedOperationException("Error type codegen not yet implemented"); + } + + private Symbol createNormalStructure(StructureShape shape) { + // Compute the name here so it is in scope for the builder calls below. 
+ String name = getDefaultShapeName(shape); + + return Symbol.builder() + .name(name) + // Change this to however the settings object configures the + // target namespace and the namespace separator for the + // language. + .namespace(settings.getNamespace(), ".") + // Change this to however filenames should work for each generated + // type. If this changes, then the files used to generate + // code should automatically change too. 
+ .definitionFile("models/" + name + ".xyz") + .build(); + } + + private String getDefaultShapeName(Shape shape) { + // Use the service-aliased name and ensure it's capitalized. + return StringUtils.capitalize(shape.getId().getName(service)); + } + + // TODO implement other ShapeVisitor methods. + } + + +Automatically handling reserved words +------------------------------------- + +Smithy code generators are expected to automatically ensure that the code +they generate from a model is valid. Service teams +defining Smithy models should not need to know the intricacies of how +Smithy models are converted to every programming language. Instead, +Smithy code generators should ensure that reserved words in a target +environment are not used by the ``SymbolProvider``. The +``smithy-codegen-core`` library provides several abstractions for +handling reserved words. These abstractions should be used in your +``SymbolProvider``. + +The primary abstraction is the `ReservedWords`_ interface. The +`ReservedWordsBuilder`_ class provides a convenient way to build an instance +of ``ReservedWords``. These ``ReservedWords`` instances should be +integrated into your ``SymbolProvider`` by passing the created names, +namespaces, and member names through the appropriate escaper. For +example: + +.. code-block:: java + :emphasize-lines: 4,8-11,18 + + final class SymbolVisitor implements SymbolProvider, ShapeVisitor { + + // ... other properties + private final ReservedWords escaper; + + SymbolVisitor(Model model, MySettings settings) { + // ... other setup + this.escaper = new ReservedWordsBuilder() + .put("function", service.getId().getName() + "Function") + .put("throw", service.getId().getName() + "Throw") + .build(); + } + + // other methods... 
+ + private String getDefaultShapeName(Shape shape) { + String name = StringUtils.capitalize(shape.getId().getName(service)); + return escaper.escape(name); + } + } + +While you can manually define the mapping for each reserved word, a +simpler method is to create an algorithm for automatically handling +reserved words. This can be done by creating a newline-delimited file +that contains each reserved word and a ``Function<String, String>`` that +takes a reserved word and returns an escaped word. For example, given +the following file named *reservedwords.txt*: + +.. code-block:: none + + function + throw + +``ReservedWordsBuilder`` can be configured to escape words using the +file and your escaping function. + +.. code-block:: java + + Function<String, String> escaper = word -> { + // Returns something like "MyServiceFunction". + return service.getId().getName() + StringUtils.capitalize(word); + }; + + URL wordsFile = getClass().getResource("reservedwords.txt"); + + ReservedWords reservedWords = new ReservedWordsBuilder() + .loadWords(wordsFile, escaper) + .build(); + +Reserved words handling should be as granular as possible. If a symbol +is only reserved in certain contexts, then that word should only be +treated as reserved in that context. This might require the use of +multiple instances of ``ReservedWords``. + +.. code-block:: java + + var memberNameEscaper = new ReservedWordsBuilder() + .loadWords(memberNameWordsFile, escaper) + .build(); + + var classNameEscaper = new ReservedWordsBuilder() + .loadWords(classNameWordsFile, escaper) + .build(); + +Reserved words handling is case-sensitive by default. You can load a +reserved words file case-insensitively using +``ReservedWordsBuilder#loadCaseInsensitiveWords``. + +.. 
code-block:: java + + var caseInsensitiveEscaper = new ReservedWordsBuilder() + .loadCaseInsensitiveWords(wordsFile, escaper) + .build(); + + +Composing ``SymbolProviders`` +----------------------------- + +``SymbolProvider`` has a very simple interface, making it easy to +compose functionality using decorators. Decorators can be used to do +things like add caching or add more contextual data to Symbols. + +The following example decorates a ``SymbolProvider`` by adding caching +of resolved Symbols: + +.. code-block:: java + + var cachedProvider = SymbolProvider.caching(mySymbolProvider); + +The following example creates a decorator that adds a "shape" property +to every Symbol: + +.. code-block:: java + + final class MyCodegenPlugin { + static SymbolProvider wrapSymbolProvider(SymbolProvider delegate) { + return shape -> { + return delegate.toSymbol(shape).toBuilder() + .putProperty("shape", shape) + .build(); + }; + } + } + + var wrapped = MyCodegenPlugin.wrapSymbolProvider(mySymbolProvider); + + +Integrating Symbols into your ``SymbolWriter`` +============================================== + +``SymbolWriter`` provides some building blocks to help integrate Symbols +into a particular programming language, but the actual gluing together +of abstractions, generating import statements, generating dependencies, +accounting for aliasing, etc. is an exercise left to each language +implementation of Smithy. + + +Create an ``ImportContainer`` for your language +----------------------------------------------- + +An `ImportContainer`_ is used to track the imports associated with a +specific file being generated. Each time a ``Symbol`` is written to a +`SymbolWriter`_, and each time a method like ``SymbolWriter#addImport`` +is called, the ``Symbol`` is sent to the ``ImportContainer`` owned by +the ``SymbolWriter``. The ``ImportContainer`` should be aware of the +current namespace in use by the ``SymbolWriter``. 
+ +The following example implements a simple ``ImportContainer`` for a made +up language. If a provided ``Symbol`` is in the same namespace that the +container is tracking, the import is discarded. Otherwise, each import +is added to a map of namespaces to a map of alias → target name. + +.. code-block:: java + + final class MyLangImports implements ImportContainer { + private final Map<String, Map<String, String>> imports = new TreeMap<>(); + private final MyLangSettings settings; + private final String namespace; + + MyLangImports(MyLangSettings settings, String namespace) { + this.settings = settings; + this.namespace = namespace; + } + + @Override + public void importSymbol(Symbol symbol, String alias) { + var symbolNamespace = symbol.getNamespace(); + + // Only import symbols in other namespaces. + if (!symbolNamespace.equals(namespace)) { + var namespaceImports = imports.computeIfAbsent(symbolNamespace, ns -> new TreeMap<>()); + namespaceImports.put(alias, symbol.getName()); + } + } + + @Override + public String toString() { + if (imports.isEmpty()) { + return ""; + } + + // Build up each line of import statements. + var builder = new StringBuilder(); + + for (var entry : imports.entrySet()) { + var ns = entry.getKey(); + + // Each namespace maps to its own alias -> target name entries. + for (var aliasEntry : entry.getValue().entrySet()) { + var alias = aliasEntry.getKey(); + var target = aliasEntry.getValue(); + builder.append("import ").append(target); + + // Use a made up aliasing syntax if the alias differs from the target. + if (!alias.equals(target)) { + builder.append(" as ").append(alias); + } + + // Import from a target namespace, one import per line. + builder.append(" from ").append(ns).append("\n"); + } + } + + return builder.toString(); + } + } + +``ImportContainer`` implements ``toString`` so that ``SymbolWriter`` can +write out imports before writing out the rest of the code. 
+ + +Create a ``SymbolWriter`` subclass +---------------------------------- + +Each language should create a subclass of `SymbolWriter`_ that +automatically manages imports, symbols, and writes documentation +strings. 
+ +The following example shows how a subclass of ``SymbolWriter`` can be +created. + +.. code-block:: java + + public final class MyWriter extends SymbolWriter<MyWriter, MyImportContainer> { + + public MyWriter(String namespace) { + super(new MyImportContainer(namespace)); + + // Write Symbols relative to the current namespace. + setRelativizeSymbols(namespace); + } + + @Override + public String toString() { + // You can override how code is converted to a string. For example, + // this allows you to add a prelude to generated code or to write the + // necessary imports that were used in the writer. + return getImportContainer().toString() + "\n\n" + super.toString(); + } + + public MyWriter someCustomMethod() { + // You can implement custom methods that are specific to whatever + // language you're implementing a generator for. + return this; + } + } + + +Use ``WriterDelegator`` to create writers +----------------------------------------- + +In order to track the dependencies used while generating code, and to +add code interceptors to each created ``SymbolWriter``, code generators +should use a `WriterDelegator`_ to create ``SymbolWriters``. + +A ``WriterDelegator`` is used to create and track all the +``SymbolWriters`` used during code generation. A codegen project will +generally use a single ``WriterDelegator`` during codegen. A code +generator should register all code interceptors returned from +integrations with each created ``SymbolWriter`` (this is something you +need to do manually). + +Let's say you need to generate code for a structure shape. You ask the +``WriterDelegator`` to give you the appropriate ``SymbolWriter``: + +.. code-block:: java + + delegator.useShapeWriter(shape, writer -> { + writer.write("Structure $L", shape.getId()); + }); + +``WriterDelegator`` will create the appropriate ``SymbolWriter`` that +writes to the correct file location based on the ``Symbol`` created for +the given shape. 
If multiple shapes use the same filename, then +``WriterDelegator`` will provide the same ``SymbolWriter`` to each call +to ``useShapeWriter``, and it will automatically inject ``\n`` prior to +vending a previously used writer (this can be customized). + +Use ``useFileWriter`` to write to a file that isn't specific to a shape: + +.. code-block:: java + + delegator.useFileWriter("README.md", writer -> { + writer.write(""" + # This is my README! + + Do you like it? + """); + }); + +When codegen has completed, the generator needs to call ``flushWriters`` +on the delegator to write each created ``SymbolWriter`` to the +``FileManifest`` the generator is using: + +.. code-block:: java + + delegator.flushWriters(); + +All the symbol dependencies detected when using each ``SymbolWriter`` +can be retrieved from the delegator using ``getDependencies``. + +.. code-block:: java + + List<SymbolDependency> dependencies = delegator.getDependencies(); + +These dependencies can then be used to generate things like dependency +manifests for the created code. + + +FAQ +=== + +How do I add more information to ``Symbols``, ``SymbolReferences``, and ``SymbolDependencies``? +----------------------------------------------------------------------------------------------- + +Use typed property bags to store additional information. For example: + +.. code-block:: java + + Symbol foo = Symbol.builder() + .name("Foo") + .namespace("example.foo", ".") + .putProperty("customData", "hello") + .build(); + + String customData = foo.getProperty("customData", String.class); + +You can add properties to an existing ``Symbol``, ``SymbolReference``, +or ``SymbolDependency`` by calling ``toBuilder`` first: + +.. code-block:: java + + foo = foo.toBuilder() + .putProperty("anotherProperty", true) + .build(); + + +Does ``SymbolWriter`` require one namespace per file? +----------------------------------------------------- + +No, but that's the easiest way to use ``SymbolWriter``. 
Your language's +subclass can be set up so that it uses multiple ``ImportContainer`` +instances, one per namespace, in a single file. For example, an ``ImportContainer`` +could be given the current namespace of a ``SymbolWriter`` each time it's +invoked, allowing the ``ImportContainer`` to perform more targeted +relativization. Then the ``ImportContainer`` would need special methods +used to convert each nested namespace's imports to a string. It's an +abstract exercise left up to the implementation. + + +.. _SymbolReference: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/SymbolReference.java +.. _SymbolDependencyContainer: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/SymbolDependencyContainer.java +.. _DependencyTracker: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/DependencyTracker.java +.. _ReservedWords: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/ReservedWords.java +.. _ReservedWordsBuilder: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/ReservedWordsBuilder.java +.. _ImportContainer: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/ImportContainer.java +.. _SymbolWriter: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/SymbolWriter.java +.. 
_WriterDelegator: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/WriterDelegator.java diff --git a/docs/source-2.0/guides/codegen/generating-code.rst b/docs/source-2.0/guides/codegen/generating-code.rst new file mode 100644 index 00000000000..54d4e5cc9cd --- /dev/null +++ b/docs/source-2.0/guides/codegen/generating-code.rst @@ -0,0 +1,678 @@ +--------------- +Generating Code +--------------- + +A "code writer" is the main abstraction used to generate code. It can be +used to write basically any kind of code, including whitespace-sensitive +and brace-based languages. The following example generates some Python code: + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + + writer.write("def Foo(str):") + .indent() + .write("print str"); + + String code = writer.toString(); + +There are a few kinds of code writers: + +- `AbstractCodeWriter`_: An abstract class that can be extended to + create language-specific writers. +- `SimpleCodeWriter`_: An implementation of ``AbstractCodeWriter`` with no + added methods or support for ``Symbol``\ s. +- `SymbolWriter`_: An abstract class that extends ``AbstractCodeWriter``, + available in `software.amazon.smithy:smithy-codegen-core `__. + This class adds abstractions for managing imports and dependencies. + Smithy code generators should extend this class. + See :doc:`decoupling-codegen-with-symbols`. + + +``AbstractCodeWriter`` is a lightweight template engine +======================================================= + +An ``AbstractCodeWriter`` can be used as a lightweight templating +language. It supports interpolation, formatting, +:ref:`intercepting named sections of the generated content `, +conditionals, and loops. This removes the need to add a dependency on a Java +templating engine and the need to integrate Smithy Symbols and dependency +management into other templating languages. 
The following example uses Java 17 +text blocks to generate a contiguous section of code: + +.. code-block:: java + + writer.pushState(); + + // Add variables that can be referenced in templates. + writer.putContext("name", settings.getModuleName()); + writer.putContext("version", settings.getModuleVersion()); + writer.putContext("description", settings.getModuleDescription()); + + writer.write(""" + [flake8] + # We ignore E203, E501 for this project due to black + ignore = E203,E501 + + [metadata] + name = ${name:L} + version = ${version:L} + description = ${description:L} + license = Apache-2.0 + python_requires = >=3.10 + classifiers = + Development Status :: 2 - Pre-Alpha + Intended Audience :: Developers + Intended Audience :: System Administrators + Natural Language :: English + License :: OSI Approved :: Apache Software License + Programming Language :: Python + Programming Language :: Python :: 3 + Programming Language :: Python :: 3 :: Only + Programming Language :: Python :: 3.10 + """); + + writer.popState(); + + +Interpolation +============= + +Various methods like ``write()`` and ``writeInline()`` take a template +string and a variadic list of arguments that are *interpolated*, or +replaced, into the expression. + +In the following example, ``$L`` is interpolated and replaced with the +relative argument, ``"there!"``. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.write("Hello, $L", "there!"); + assert(writer.toString().equals("Hello, there!\n")); + +The ``$`` character is escaped using ``$$``. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter().write("$$L"); + assert(writer.toString().equals("$L\n")); + +The default character used to start an expression is ``$``, but this can +be changed for the current state of the ``AbstractCodeWriter`` by +calling ``setExpressionStart(char)``. This might be useful for +programming languages that make heavy use of ``$`` like PHP or Kotlin. 
A +custom start character can be escaped using two start characters in a +row. For example, given a custom start character of ``#``, ``#`` can be +escaped using ``##``. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.setExpressionStart('#'); + writer.write("#L ##L $L", "hi"); + assert(writer.toString().equals("hi #L $L\n")); + + +Formatters +========== + +An ``AbstractCodeWriter`` supports three kinds of interpolation: +relative, positional, and named. Each of these kinds of interpolation +passes a value to a *formatter*. Formatters are named functions that +accept an object as input, accept a string that contains the current +indentation (it can be ignored if not useful), and return a string as +output. ``AbstractCodeWriter`` registers the following built-in formatters: + +- ``L`` (literal): Outputs a literal value of an ``Object`` using the + following implementation: (1) A null value is formatted as "". (2) An + empty ``Optional`` value is formatted as "". (3) A non-empty + ``Optional`` value is formatted using the value inside the + ``Optional``. (4) All other values are formatted using the result of + calling Java's ``String#valueOf``. +- ``S`` (string): Adds double quotes around the result of formatting a + value first using the default literal "L" implementation described + above and then wrapping the value in an escaped string safe for use + in Java according to + https://docs.oracle.com/javase/specs/jls/se7/html/jls-3.html#jls-3.10.6. + This should work for many programming languages, but this formatter + can be overridden if needed. +- ``C`` (call): Used to break up a template and execute code at + specific locations. ``$C`` stands for "call" and is used to run a + ``Runnable`` or ``Consumer`` that is expected to + write to the same writer. Any text written to the writer is used as + the interpolation result. Note that a single trailing newline is + removed from the captured text. 
If a ``Runnable`` is provided, it is + required to have a reference to the writer. A ``Consumer`` is + provided a reference to the writer as a single argument. Using a + ``Consumer`` makes it possible to create more generic methods for + handling different sections of code. +- …: Custom formatters can be registered using + ``AbstractCodeWriter#putFormatter``. Registering custom formatters + with a writer for common formatting tasks is a great way to simplify + a code generator. + + +Relative parameters +=================== + +Placeholders in the form of "$" followed by a formatter name are treated +as relative parameters. The first relative parameter +interpolates the first argument, the second interpolates the second +argument, and so on. All +relative arguments must be used as part of an expression and relative +interpolation cannot be mixed with positional variables. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.write("$L $L $L", "a", "b", "c"); + assert(writer.toString().equals("a b c\n")); + + +Positional parameters +===================== + +Placeholders in the form of "$" followed by a positive number, followed +by a formatter name are treated as positional parameters. The number +refers to the 1-based index of the argument to interpolate. All +positional arguments must be used as part of an expression and relative +interpolation cannot be mixed with positional variables. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.write("$1L $2L $3L, $3L $2L $1L", "a", "b", "c"); + assert(writer.toString().equals("a b c, c b a\n")); + + +Named parameters +================ + +Named parameters are parameters that take a value from the context of +the current state. 
They take the following form +``$<variable>:<formatter>``, where ``<variable>`` is a string that +starts with a lowercase letter, followed by any number of +``[A-Za-z0-9_#$.]`` characters, and ``<formatter>`` is the name of a +formatter. + +.. 
code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.pushState(); + writer.putContext("foo", "a"); + writer.putContext("bar", "b"); + writer.write("$foo:L $bar:L"); + writer.popState(); + assert(writer.toString().equals("a b\n")); + + +.. _inline-block-alignment: + +Inline block alignment +====================== + +Sometimes it's necessary to maintain the exact indentation level of an +interpolated property even if newlines are written when interpolating. +For example, say we wanted to indent a variable list of names, +``Bob\nKaren\nLuis``, like this: + +.. code-block:: none + + Names: Bob + Karen + Luis + +Using normal ``$L`` expansion: + +.. code-block:: java + + writer.write("$L: $L", "Names", "Bob\nKaren\nLuis"); + +``$L`` does not preserve the desired indentation, resulting in: + +.. code-block:: none + + Names: Bob + Karen + Luis + +Indentation can be preserved to match the desired list from the first +example by using the inline block alignment operator (that is, putting +``|`` before the closing brace): + +.. code-block:: java + + writer.write("$L: ${L|}", "Names", "Bob\nKaren\nLuis"); + +If all the characters on the line in the template leading up to the +interpolation are spaces or tabs, then those characters are applied +before each new line. This means that block alignment works even with +tab-based languages: + +.. code-block:: java + + writer.write(""" + if (true) { + \t\t${C|} + } + """, + writer.call(w -> w.write("Hi\nHello")) + ); + +Outputs: + +.. code-block:: none + + if (true) { + \t\tHi + \t\tHello + } + + +Breaking up large templates with the ``$C`` formatter +===================================================== + +The ``$C`` formatter can be used to break up large codegen templates +without losing the readability benefits of `Java text blocks`_. +The ``$C`` formatter pairs well with inline block alignment, allowing +you to generate indented sections of code within a larger template. 
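To make the alignment rule concrete, here is a minimal plain-Java sketch of what block alignment does to captured multi-line text. It is not how ``AbstractCodeWriter`` implements the feature; ``BlockAlignSketch`` and ``alignBlock`` are hypothetical names used only for illustration, and padding with spaces when the leading text is not pure whitespace is an assumption based on the "Names" example above.

```java
final class BlockAlignSketch {
    private BlockAlignSketch() {}

    /**
     * Renders one template line whose interpolation sits right after
     * {@code leading}, aligning every continuation line of {@code value}.
     */
    static String alignBlock(String leading, String value) {
        // Reuse the leading characters when they are only spaces or tabs;
        // otherwise (assumption) pad with spaces to the same column.
        boolean onlyWhitespace = leading.chars().allMatch(c -> c == ' ' || c == '\t');
        String indent = onlyWhitespace ? leading : " ".repeat(leading.length());

        // The first line keeps the original leading text; each later line
        // of the interpolated value is prefixed with the computed indent.
        return leading + value.replace("\n", "\n" + indent);
    }

    public static void main(String[] args) {
        System.out.println(alignBlock("\t\t", "Hi\nHello"));
        System.out.println(alignBlock("Names: ", "Bob\nKaren\nLuis"));
    }
}
```

The same idea applies to text captured from a ``$C`` call: the captured text is treated as a block and re-indented to the column where the interpolation started.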
+ +The following example uses the ``call`` method of an +``AbstractCodeWriter`` to properly type the ``Function``, and a method +reference is provided to invoke a method that accepts the writer. + +.. code-block:: java + + void someMethod() { + writer.write(""" + if (true) { + ${C|} + } else { + ${C|} + } + """, + writer.call(this::handleTrue), + writer.call(this::handleFalse)); + } + + void handleTrue(CodeWriter writer) { + writer.write("True!"); + } + + void handleFalse(CodeWriter writer) { + writer.write("False!"); + } + +.. tip:: + + When generating code, try to show the overall structure of the + code that will be generated as much as possible in larger blocks of + templated text that leverage ``${C|}``, template conditionals (e.g., + ``${?foo}${/foo}``), and template loops (e.g., ``${#foo}${/foo}``). + + +Pushing and popping states +========================== + +``AbstractCodeWriter`` maintains a stack of transformation states, +including the text used to indent, a prefix to add before each line, +newline character, the number of times to indent, a map of context +values, whether whitespace is trimmed from the end of newlines, whether +the automatic insertion of newlines is disabled, the character used to +start code expressions (defaults to ``$``), and formatters. + +State can be pushed onto the stack using ``pushState`` which copies the +current state. Mutations can then be made to the top-most state of the +``AbstractCodeWriter`` and do not affect previous states. The previous +transformation state of the ``AbstractCodeWriter`` can later be restored +using ``popState``. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer + .pushState() + .write("/**") + .setNewlinePrefix(" * ") + .write("This is some docs.") + .write("And more docs.\n\n\n") + .write("Foo.") + .popState() + .write(" */"); + +The above example outputs: + +.. code-block:: none + + /** + * This is some docs. + * And more docs. + * + * Foo. 
+ */ + +``AbstractCodeWriter`` maintains some global state that is not affected +by ``pushState()`` and ``popState()``: + +- The number of successive blank lines to trim. +- Whether a trailing newline is inserted or removed from the result of + converting the ``AbstractCodeWriter`` to a string. + + +Limiting blank lines +==================== + +Many coding standards recommend limiting the number of successive blank +lines. This can be handled automatically by ``AbstractCodeWriter`` by +calling ``trimBlankLines()``. The removal of blank lines is handled when +the ``AbstractCodeWriter`` is converted to a string. Lines that consist +solely of spaces or tabs are considered blank. If the number of blank +lines exceeds the allowed threshold, they are omitted from the result. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.trimBlankLines(); + writer.write("hello\n\n\n\nhello"); + assert(writer.toString().equals("hello\n\nhello\n")); + +In the above example, ``\n\n\n\n`` results in three blank lines (two +consecutive newlines produce one entirely blank line). ``AbstractCodeWriter`` trims +the extra blank lines, resulting in ``"hello\n\nhello\n"`` (the +trailing newline is added by ``AbstractCodeWriter`` by default +separately). Two blank lines could be allowed if the above example was +updated to pass ``2`` into ``trimBlankLines``: + +.. code-block:: java + + writer.trimBlankLines(2); + + +Trimming trailing spaces +======================== + +Many coding standards do not allow trailing spaces on lines. Trailing +spaces can be automatically trimmed from each line by calling +``trimTrailingSpaces()``. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + writer.trimTrailingSpaces(); + writer.write("hello "); + assert(writer.toString().equals("hello\n")); + + +Code sections +============= + +Named sections can be marked in the code writer that can be intercepted +and modified by *section interceptors*. 
This gives the +``AbstractCodeWriter`` an extension system for augmenting generated +code. A section of code can be captured using a *block section* or an +*inline section*. + + +Block sections +-------------- + +The primary method for creating sections of code is block sections. A +block section is created by passing a string or an implementation of +``CodeSection`` to ``pushState()``. A string gives the state a name and +captures all the output written inside this state to an internal buffer. +This buffer is then passed to each registered interceptor for that name. +These interceptors can choose to use the default contents of the section +or emit entirely different content. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + + writer.onSection("example", text -> { + writer.write("Intercepted: " + text); + }); + + writer.pushState("example"); + writer.write("Original contents"); + writer.popState(); + assert(writer.toString().equals("Intercepted: Original contents\n")); + +A better method for creating and intercepting code sections is to use an +instance of a ``CodeSection``. A ``CodeSection`` is a simple interface +that is only required to return the name of the ``CodeSection`` (in +fact, using a string for ``pushState`` internally creates a +``CodeSection``). + +Java records are an easy way to implement ``CodeSection``\ s: + +.. code-block:: java + + record NameEvent(String sectionName, String person) implements CodeSection {} + +``CodeInterceptor``\ s can be registered to intercept sections by class. +A simple way to create one-off interceptors is using +``CodeInterceptor#appender``: + +.. code-block:: java + + writer.onSection(CodeInterceptor.appender(NameEvent.class, (w, section) -> { + w.write("$L", section.sectionName()); + })); + + writer.onSection(CodeInterceptor.appender(NameEvent.class, (w, section) -> { + w.writeInline("Who?
"); + })); + + writer.onSection(CodeInterceptor.appender(NameEvent.class, (w, section) -> { + w.write("$L!", section.person()); + })); + +When a ``CodeSection`` is given to ``pushState`` or ``injectState``, +``CodeInterceptor``\ s are applied in the order they were registered. + +.. code-block:: java + + NameEvent event = new NameEvent("Zak", "McKracken"); + writer.injectSection(event); + +When applied, the ``writer`` contains the following output: + +.. code-block:: none + + Zak + Who? McKracken! + + +Inline sections +--------------- + +An *inline section* is created using a special ``AbstractCodeWriter`` +interpolation format that appends "@" followed by the section name. +Inline sections are function just like block sections, but they can +appear inline inside other content passed in calls to +``AbstractCodeWriter#write()``. + +Inline sections are created in a format string inside braced arguments +after the formatter. For example, ``${L@foo}`` is an inline section that +uses the literal "L" value of a relative argument as the default value +of the section and allows interceptors registered for the "foo" section +to make calls to the ``AbstractCodeWriter`` to modify the section. + +.. code-block:: java + + SimpleCodeWriter writer = new SimpleCodeWriter(); + + // Add an intercept for the "example" section. + writer.onSection("example", text -> writer.write("Intercepted: " + text)); + + // Write to the writer and define an inline "example" section. + // If nothing intercepts this section, "foo" is written to it. + writer.write("Leading...${L@example}...Trailing...", "foo"); + + assert(writer.toString().equals("Leading...Intercepted: foo...Trailing...\n")); + +.. note:: + + An inline section that makes no calls to ``AbstractCodeWriter#write()`` + expands to an empty string. + + +Template conditions and loops +============================= + +Conditional blocks can be defined in code writer templates using the +following syntax: + +.. 
code-block:: java + + writer.write(""" + ${?foo} + Foo is set: ${foo:L} + ${/foo}"""); + +Assuming ``foo`` is *truthy* and set to "hi", then the above template +outputs "Foo is set: hi". In the above example, "?" indicates that the +expression is a conditional block to check if the named context property +"foo" is truthy. If it is, then the contents of the block up to the +matching closing block, ``${/foo}``, are evaluated. If the condition is +not satisfied, then the contents of the block are skipped. + +You can check if a named context property is *falsey* using "^": + +.. code-block:: java + + writer.write(""" + ${^foo} + Foo is not set + ${/foo}"""); + +Assuming ``foo`` is set to "hi", then the above template outputs +nothing. If ``foo`` is falsey, then the above template outputs "Foo is +not set". + + +Truthy and falsey values +------------------------ + +The following values are considered falsey: + +- properties that are not found +- null values +- false +- empty `String `__ +- empty `Iterable `__ +- empty `Map `__ +- empty `Optional `__ + +Values that are not falsey are considered truthy. + + +Loops +----- + +Loops can be created to repeat a section of a template for each value +stored in a list or each key-value pair stored in a map. Loops are +created using ``#``. + +The following template with a "foo" value of +``{"key1": "a", "key2": "b", "key3": "c"}``: + +.. code-block:: java + + writer.write(""" + ${#foo} + - ${key:L}: ${value:L} (first: ${key.first:L}, last: ${key.last:L}) + ${/foo} + """); + +Evaluates to: + +..
code-block:: none + + - key1: a (first: true, last: false) + - key2: b (first: false, last: false) + - key3: c (first: false, last: true) + +Each iteration of the loop pushes a new state in the writer that sets +the following context properties: + +- ``key``: contains the current 0-based index of an iterator or the + current key of a map entry +- ``value``: contains the current value of an iterator or current value + of a map entry +- ``key.first``: set to true if the loop is on the first iteration +- ``key.last``: set to true if the loop is on the last iteration + +A custom variable name can be used in loop variable bindings. For +example: + +.. code-block:: java + + writer.write(""" + ${#foo as key1, value1} + - ${key1:L}: ${value1:L} (first: ${key1.first:L}, last: ${key1.last:L}) + ${/foo}"""); + + +Whitespace control +------------------ + +Conditional blocks and loop blocks that occur on lines that only contain +whitespace are not written to the template output. For example, if +``foo`` in the following template is falsey, then the template expands +to an empty string: + +.. code-block:: java + + writer.write(""" + ${?foo} + Foo is set: ${foo:L} + ${/foo}"""); + +Whitespace that comes before a template expression can be removed by +putting ``-`` at the beginning of the expression. + +Assuming that the first positional argument is "hi": + +.. code-block:: java + + writer.write(""" + Greeting: + ${-L}"""); + +Expands to: + +.. code-block:: none + + Greeting:hi\n + +Whitespace that comes after a template expression can be removed by +adding ``-`` to the end of the expression: + +.. code-block:: java + + writer.write(""" + ${L-} + + ."""); + +Expands to: + +.. code-block:: none + + hi.\n\n + +Leading whitespace cannot be removed when using +:ref:`inline block alignment ` +(``|``). The following is *invalid*: + +.. code-block:: java + + writer.write("${-C|}"); + // ^ ^ invalid combination + + +..
_AbstractCodeWriter: https://github.com/awslabs/smithy/blob/main/smithy-utils/src/main/java/software/amazon/smithy/utils/AbstractCodeWriter.java +.. _SimpleCodeWriter: https://github.com/awslabs/smithy/blob/main/smithy-utils/src/main/java/software/amazon/smithy/utils/SimpleCodeWriter.java +.. _SymbolWriter: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/SymbolWriter.java +.. _Java text blocks: https://docs.oracle.com/en/java/javase/13/text_blocks/index.html diff --git a/docs/source-2.0/guides/codegen/implementing-the-generator.rst b/docs/source-2.0/guides/codegen/implementing-the-generator.rst new file mode 100644 index 00000000000..ad211b6b220 --- /dev/null +++ b/docs/source-2.0/guides/codegen/implementing-the-generator.rst @@ -0,0 +1,222 @@ +-------------------------- +Implementing the Generator +-------------------------- + +This document describes how to implement a code generator using the +high-level `DirectedCodegen `__ +interface. + + +.. _directedcodegen: + +DirectedCodegen +=============== + +Smithy code generators typically all follow the same patterns. In fact, +the layout of existing code generators is so similar that a kind of +"golden-path" codegen architecture was designed called +*directed codegen*. The ``DirectedCodegen`` interface and ``CodegenDirector`` +class provide a kind of guided template for building a code generator. + +``DirectedCodegen`` brings together all the opinionated abstractions for +implementing a Smithy code generator. + +- ``Symbol`` and ``SymbolProvider`` classes used to map shapes to code + and decouple this logic from templates. (see :doc:`decoupling-codegen-with-symbols`) +- A language-specific ``SymbolWriter`` subclass used to generate code + using a simple template engine. (see :doc:`generating-code`) +- A ``SmithyIntegration`` subtype used to provide extension points to + the generator. 
(see :doc:`making-codegen-pluggable`) +- Easy to find helper methods for getting information from the model. + (see :doc:`using-the-semantic-model`) +- Pre-defined "directives" that tell you the kinds of shape and trait + combinations that need to be code generated. + + +Implementing ``DirectedCodegen`` +================================ + +The methods of ``DirectedCodegen`` break down the process of building up +and running a generator into specific methods. + +.. code-block:: java + + public interface DirectedCodegen, S, I extends SmithyIntegration> { + + SymbolProvider createSymbolProvider(CreateSymbolProviderDirective directive); + + C createContext(CreateContextDirective directive); + + void generateService(GenerateServiceDirective directive); + + default void generateResource(GenerateResourceDirective directive) {} + + void generateStructure(GenerateStructureDirective directive); + + void generateError(GenerateErrorDirective directive); + + void generateUnion(GenerateUnionDirective directive); + + void generateEnumShape(GenerateEnumDirective directive); + + void generateIntEnumShape(GenerateIntEnumDirective directive); + + default void customizeBeforeShapeGeneration(CustomizeDirective directive) {} + + default void customizeBeforeIntegrations(CustomizeDirective directive) {} + + default void customizeAfterIntegrations(CustomizeDirective directive) {} + } + +The `source code for DirectedCodegen`_ can be found on GitHub. + + +DirectedCodegen prerequisites +----------------------------- + +``DirectedCodegen`` has a few prerequisites before it can be implemented. + +- A ``SymbolProvider`` implementation used to map Smithy shapes to + Symbols (type ``S``). :doc:`mapping-shapes-to-languages` provides + guidance on how shapes should map to a programming language, and + :doc:`decoupling-codegen-with-symbols` describes how + Symbols are used to perform the actual mapping. +- A specific implementation of ``CodegenContext`` (type ``C``). 
This object + provides access to codegen settings, abstractions for writing files, and + abstractions for creating code writers. This context object has its + own prerequisites: + + - A settings object for the code generator. This context object + contains the codegen settings passed to your generator through + ``smithy-build.json`` plugins (see :doc:`configuring-the-generator`). + - A subclass of ``SymbolWriter`` used to generate code for the + target language. +- A ``SmithyIntegration`` implementation used to make the generator + extensible (type ``I``). + + +.. _running-directedcodegen: + +Running ``DirectedCodegen`` using a ``CodegenDirector`` +======================================================= + +A ``CodegenDirector`` is used in concert with a ``DirectedCodegen`` +implementation to build up the context needed to run the generator and +call methods in the right order. ``CodegenDirector`` is typically called +in a Smithy-Build plugin using the data provided by +``software.amazon.smithy.build.PluginContext``. + +.. code-block:: java + + @Override + public void execute(PluginContext context) { + CodegenDirector runner = new CodegenDirector<>(); + + // Assuming MylangGenerator is an implementation of DirectedCodegen. + runner.directedCodegen(new MylangGenerator()); + + // Set the SmithyIntegration class to look for and apply using SPI. + runner.integrationClass(TestIntegration.class); + + // Set the FileManifest and Model from the plugin. + runner.fileManifest(context.getFileManifest()); + runner.model(context.getModel()); + + // Create a MylangSettings object from the plugin settings. + MylangSettings settings = runner.settings(MylangSettings.class, + context.getSettings()); + + // Assuming service() returns the configured service shape ID. + runner.service(settings.service()); + + // Configure the director to perform some common model transforms. 
+ runner.performDefaultCodegenTransforms(); + runner.createDedicatedInputsAndOutputs(); + + runner.run(); + } + +After performing the above steps, ``CodegenDirector`` will: + +1. Perform any requested model transformations +2. Automatically find implementations of your ``SmithyIntegration`` + class using Java SPI. These implementations are then used throughout + the rest of code generation. +3. Register the ``CodeInterceptors`` from each ``SmithyIntegration`` + with your ``WriterDelegator`` +4. Call each ``generate``\ \* method in a topologically sorted order + (that is, things with no references to other shapes come before + shapes that reference them) +5. Call ``DirectedCodegen#customizeBeforeIntegrations`` +6. Run the ``customize`` method of each ``SmithyIntegration`` +7. Call ``DirectedCodegen#customizeAfterIntegrations`` +8. Flush any open ``SymbolWriter``\ s in your ``WriterDelegator``. + + +Creating a settings class +========================= + +A code generator uses a settings object to configure the generator in +Smithy-Build and during directed code generation. At a minimum, this +class should have a ``ShapeId`` for the service to generate. + +.. code-block:: java + + public final class MylangSettings { + private ShapeId service; + + public void service(ShapeId service) { + this.service = service; + } + + public ShapeId service() { + return service; + } + } + +.. seealso:: :doc:`configuring-the-generator` defines recommended settings + + +Creating a ``CodegenContext`` class +=================================== + +This object provides access to codegen settings, abstractions for +writing files, and abstractions for creating code writers. You should +create a specific implementation of ``CodegenContext`` for each +generator. This can be done using a Java record, POJO, builder, etc. + +.. 
code-block:: java + + public record MylangContext ( + Model model, + MylangSettings settings, + SymbolProvider symbolProvider, + FileManifest fileManifest, + WriterDelegator writerDelegator, + List integrations, + ServiceShape service + ) implements CodegenContext {} + +``DirectedCodegen#createContext`` is responsible for creating a +``CodegenContext``. Ensure that the data provided by your ``CodegenContext`` +are available using the data available to ``CreateContextDirective``. + + +Tips for using ``DirectedCodegen`` +================================== + +1. Each directive object provided to the methods of a DirectedCodegen + implementation provide all the context needed to perform that action. +2. In addition to context, directives often provide helper methods to get + information out of the model or shape being generated. +3. If additional data is needed in a given directive, you can: + + 1. Add new getters to your ``CodegenContext`` class. + 2. Add state to your ``DirectedCodegen`` class to set the context + data you need. + + +.. _source code for DirectedCodegen: https://github.com/awslabs/smithy/blob/main/smithy-codegen-core/src/main/java/software/amazon/smithy/codegen/core/directed/DirectedCodegen.java diff --git a/docs/source-2.0/guides/codegen/index.rst b/docs/source-2.0/guides/codegen/index.rst new file mode 100644 index 00000000000..44eb4655d22 --- /dev/null +++ b/docs/source-2.0/guides/codegen/index.rst @@ -0,0 +1,132 @@ +-------------------------------- +Creating a Smithy Code Generator +-------------------------------- + +This guide describes how to structure a new Smithy code generator so that +the generator can be used to build generic clients for any web service +modeled with Smithy, how to build shape code generators that are reusable +in a client and server context, and how to ensure a proper separation +between generic code generation and AWS-specific code generation. + +.. note:: + + This is a living document. 
Important content might be missing, and + the guidance provided here may change over time. + + +Smithy's Java reference implementation +====================================== + +This guide is tailored to `Smithy's Java reference implementation`_ and +Gradle_ as a build tool, but much of the guidance is applicable to +implementations in other languages as well. The reference implementation +includes various abstractions that code generators can use to reduce the +development effort needed to build a new code generator. + + +Pluggable codegen +================= + +Just like Smithy models, Smithy code generators need to be pluggable and +extensible. The code generator needs to be able to react to traits found +in the model and influence the generated code and related artifacts like +dependency graphs. For example, if the :ref:`aws.auth#sigv4-trait` is found +on a service, a code generator should look for a codegen plugin that adds +support for signing requests using AWS SigV4. Codegen plugins need to be +able to influence the dependencies of a client, the client configuration +options exposed by the client, the interceptors used by a client, and how +the client serializes and deserializes shapes. + + +Goals of this guide +=================== + +1. Define requirements of a Smithy code generator and recommendations + on how to meet those requirements. +2. Define a recommended project layout and deliverables. +3. Define clear boundaries between generic code generation and AWS-specific + code generation to avoid coupling. +4. Increase consistency across implementations, making it easier to + contribute changes to multiple generators. + + +Non-Goals of this guide +======================= + +1. Remove all ambiguity on how to build a Smithy code generator. Each + codegen project is unique because each target language is unique. +2. Force specific implementation details. This guide is non-normative. 
+ You're free to implement a code generator in any language using any + tooling you want. +3. Document the Smithy specification. This is supplementary content to + help guide code generators and is not intended to restate what is + already defined by the specification. + + +Tenets for Smithy code generators +================================= + +These are the tenets of Smithy code generators +(unless you know better ones): + +1. **Smithy implementations adhere to the spec**. The Smithy spec and model + are the contract between clients, servers, and other implementations. + A Smithy client written in any programming language should be able to + connect to a Smithy server written in any programming language without + either having to care about the programming language used to implement + the other. +2. **The code Smithy generates is familiar to developers**. Language idioms + and developer experience factor into how developers and companies + choose between Smithy and alternatives. +3. **Components not monoliths**. We write modular components that + developers can compose together to meet their requirements. Our + components have clear boundaries: adding a dependency on an AWS protocol + does not require a client to use AWS credentials; Smithy code generators + do not depend on AWS SDK libraries. +4. **Developers trust the code Smithy generates**. Generated code is valid + without the need to manually edit or further transform it, it is + readable and easy to understand, and it does the right thing by + default. We avoid breaking changes to generated code outside of major + version bumps. +5. **Our code is maintainable because we limit public interfaces**. We + limit the dependencies we take on. We don't expose overly open + interfaces that hinder our ability to evolve the code base. +6. **No implementation stands alone**.
Test cases, protocol tests, code + fixes, and missing abstractions have a greater impact if every Smithy + implementation can use them rather than just a single implementation. +7. **Service teams don't need to know the details of every code + generator that exists or will ever exist**. When modeling a service, + service teams only need to consider if the model is a valid Smithy + model; the constraints of any particular programming language should + not be a concern when modeling a service. Smithy is meant to work + with any number of languages, and it is an untenable task to attempt + to bubble up every constraint, reserved word, or other limitations to + modelers. + + +Navigation +========== + +.. toctree:: + :maxdepth: 1 + + overview-and-concepts + mapping-shapes-to-languages + creating-codegen-repo + configuring-the-generator + implementing-the-generator + making-codegen-pluggable + generating-code + decoupling-codegen-with-symbols + using-the-semantic-model + +.. TODO: Testing doc topics: +.. integration testing, protocol tests, using examples as tests. + +.. TODO: client topics: +.. Generating a client interface, configuration, interceptors, +.. observability, Smithy reference architecture, paginators, +.. waiters, endpoint resolution + +.. _Smithy's Java reference implementation: https://github.com/awslabs/smithy +.. _Gradle: https://gradle.org diff --git a/docs/source-2.0/guides/codegen/making-codegen-pluggable.rst b/docs/source-2.0/guides/codegen/making-codegen-pluggable.rst new file mode 100644 index 00000000000..34e73ba84d3 --- /dev/null +++ b/docs/source-2.0/guides/codegen/making-codegen-pluggable.rst @@ -0,0 +1,360 @@ +------------------------ +Making Codegen Pluggable +------------------------ + +This document describes various code generation and runtime concepts +that can be used to make Smithy code generators open to extension. + + +Why make codegen extensible? 
+============================ + +Smithy code generators need to be extensible so that optional features +can be contributed to augment generated code. For example, Smithy code +generators can generate generic clients that know how to send requests +to an endpoint, but AWS SDK code generators resolve endpoints based on +other configuration settings like regions. Smithy code generators should +have no built-in concept of "region", and instead they should rely on +codegen *integrations* that can augment generated code based on the +presence of traits and configuration found in smithy-build.json files. + + +Integrations +============ + +*Integrations* are the primary abstraction used to customize Smithy code +generators. Integrations are found on the classpath using +Java Service Provider Interfaces (:term:`SPI`) and are used to customize +Smithy code generators. + + +What can integrations customize? +-------------------------------- + +Various aspects of a Smithy code generator can be customized. For example: + +- Generate custom files like licenses, readmes, etc. 
+- Preprocess the model (e.g., validate that the model uses only features + supported by the generator, remove unsupported features, add + codegen-specific traits, etc.) +- Add parameters used to configure a client (e.g., constructor + arguments, builder parameters, etc.) +- Inject interceptors into the client automatically (based on traits or + opt-in flags) +- Inject custom client or server HTTP request headers +- Contribute available protocol implementations that a generator can + choose from when generating clients or servers +- Contribute authentication scheme implementations that a generator can + choose from when generating clients or servers +- Intercept and modify sections of generated code (this feature is part + of :term:`AbstractCodeWriter`) +- Add dependencies either unconditionally or based on the presence of + shapes and traits in the model +- Modify the :term:`SymbolProvider` used to convert shapes into code + (e.g., add custom reserved words, change how names are generated, etc.) +- Add custom retry strategies + + +Only customize through opt-in +----------------------------- + +Simply finding an integration on the classpath should not enable the +integration. Integrations should only be enabled through opt-in signals. +Traits found in the model and feature configuration in smithy-build.json +are used to enable customizations performed by integrations. + + +Creating a ``SmithyIntegration`` +================================= + +Smithy codegen provides a pre-built integration interface, +`SmithyIntegration `__, +that *should* be used by every Smithy code generator. Using this +standardized interface ensures all code generators follow the same basic +framework and makes it easier to contribute features that span multiple +code generators. + +``SmithyIntegration`` requires a few generic type parameters: + +.. code-block:: java + + SmithyIntegration, + C extends CodegenContext> + +- ``S``: The settings object used to configure the code generator.
This + object should be a basic POJO or `Java + record `__ + that captures the same properties used by smithy-build.json files to + configure the generator. For example, this object might contain the + service shape ID being generated, a specific protocol shape ID to + generate, the code generation mode (client or server), etc. +- ``W``: The specific subclass of ``SymbolWriter`` that is used by the + code generator. For example, if generating Python code, you should + create a ``PythonWriter`` and supply that as the type parameter. +- ``C``: The ``CodegenContext`` object used by the generator. This type + depends on codegen settings. It provides integration methods access + to the model being generated, the settings object, the + ``SymbolProvider`` used to convert shapes to ``Symbol``\ s, and a + ``FileManifest`` that's used to write files to disk. Each + implementation is expected to create a specific subtype of + ``CodegenContext``. + +Example of a custom ``SmithyIntegration`` for Python: + +.. code-block:: java + + interface PythonIntegration extends + SmithyIntegration {} + +Example codegen settings type: + +.. code-block:: java + + record PythonSettings(ShapeId service, ShapeId protocol); + // A builder pattern could be applied later if the number of arguments grows. + +Example of a custom ``CodegenContext``: + +.. code-block:: java + + record PythonContext( + Model model, + PythonSettings settings, + SymbolProvider symbolProvider, + FileManifest fileManifest, + WriterDelegator writerDelegator, + List integrations + ) implements CodegenContext {} + +This integration is then implemented to implement customizations: + +.. code-block:: java + + public final class AddCodeLicense implements PythonIntegration { + // implement overrides, detailed below + } + + +Identifying integrations +------------------------ + +Integrations are identified using the ``SmithyIntegration#name()`` +method. 
This method will return the canonical class name of the +integration by default, but it can be overridden to provide a different +name. Note that naming conflicts between integrations are not allowed. + + +How integrations are ordered +---------------------------- + +Integrations are ordered using a kind of priority-ordered dependency +graph. Integrations can specify that they should be applied before other +integrations by name and/or after other integrations by name. The +following example states that the integration needs to run before +``"Foo"`` but after ``"CodeLicenseHeader"``: + +.. code-block:: java + + @Override + public List runBefore() { + return List.of("Foo"); + } + + @Override + public List runAfter() { + return List.of("CodeLicenseHeader"); + } + +In rare cases, you might need more granular control over the order of +an integration. A priority can be provided to influence when the integration +is applied relative to other integrations when their dependencies are +resolved. The higher the priority, the earlier an integration is applied. + +.. code-block:: java + + @Override + public byte priority() { + return 64; + } + +.. tip:: + + :ref:`directedcodegen` automatically handles finding integrations on + the classpath and topologically ordering them. + + +Preprocessing models +-------------------- + +A common requirement of code generators is to preprocess the model. For +example, a generator that doesn't support :ref:`event streams ` +might want to filter out event stream operations and emit warnings. +A generator could also choose to apply synthetic traits (traits that are not +persisted when the model is serialized) to shapes in the model as part of +its code generation strategy. + +The model can be preprocessed by implementing the +``SmithyIntegration#preprocessModel`` method and returning an updated +model. + +..
code-block:: java + + @Override + public Model preprocessModel(Model model, PythonSettings settings) { + // perform some transformation and return the updated model. + return model; + } + +.. seealso:: :ref:`codegen-transforming-the-model` + + +Changing how shapes are named or how files are generated +-------------------------------------------------------- + +Another requirement when generating code might be to change the strategy +used for naming shapes, the file location of shapes, or just adding +metadata to each :term:`Symbol` created by a :term:`SymbolProvider`. This can +be achieved by implementing ``SmithyIntegration#decorateSymbolProvider``: + +.. code-block:: java + + @Override + public SymbolProvider decorateSymbolProvider(Model model, PythonSettings settings, SymbolProvider symbolProvider) { + // Decorate the symbol provider and add a "foo" property to every symbol. + return shape -> symbolProvider.toSymbol(shape) + .toBuilder() + .putProperty("foo", "hello!") + .build(); + } + + +.. _codegen-intercepting: + +Intercepting and updating sections of code +------------------------------------------ + +Code generators can designate sections of code that can be modified +by integrations. This feature allows integrations to do things like add +text to every code file (for example a license header), apply +annotations to generated types, change the type signature of a class, +change how classes are created, etc. Implementations of ``CodeInterceptor`` +registered with ``SmithyIntegration``\ s must be added to each code +writer created during code generation. + +Let's say you wanted to emit a customizable section in generated code +where headers for the code could be modified to add a custom license +header or disclaimer that the code is generated. This can be achieved by +first creating an implementation of ``CodeSection``. We'll call it +``CodeHeader``: + +.. code-block:: java + + // This event does not need any properties.
+ record CodeHeader() implements CodeSection {} + +When generating code, inject the section at the start of each file: + +.. code-block:: java + + mywriter.injectSection(new CodeHeader()); + +``injectSection`` is used because this section of code is empty by default. +If the section should have content by default, then use ``pushState`` and +``popState``: + +.. code-block:: java + + mywriter.pushState(new CodeHeader()); + mywriter.write("// This is generated code"); + mywriter.popState(); + +The call to ``injectSection`` implicitly calls ``popState``. When +``popState`` is called, the code that was written during that section is +sent to each matching interceptor so that they can prepend to the contents, +append to the contents, or completely rewrite the contents. + +``CodeInterceptor``\ s can be registered to append to this section by +returning interceptors from ``SmithyIntegration#interceptors``: + +.. code-block:: java + + @Override + public List> + interceptors(C codegenContext) { + return List.of(new CodeHeaderInterceptor()); + } + +Interceptors should be created as a dedicated class. The following interceptor +appends to the existing content in the section: + +.. code-block:: java + + final class CodeHeaderInterceptor implements CodeInterceptor.Appender { + @Override + public Class sectionType() { + return CodeHeader.class; + } + + @Override + public void append(PythonWriter writer, CodeHeader section) { + writer.write(""" + /* + * Copyright 2023 example.com, Inc. or its affiliates. All Rights Reserved. + */ + """); + } + } + + +Generating other custom content +------------------------------- + +Integrations might need to write additional files like a README, license +files, or generate additional code. Integrations can override the +``SmithyIntegration#customize`` method to perform anything they need to +do.
This method is passed the ``CodegenContext`` type that is used +with the integration, giving the ``customize`` method access to the +model, settings object, symbol provider, ``WriterDelegator``, and +``FileManifest`` used to save and read files. + +The following example writes a custom README.md file: + +.. code-block:: java + + @Override + public void customize(PythonContext context) { + context.writerDelegator().useFileWriter("README.md", writer -> { + writer.write(""" + # $L service client + Client SDK library ...""", + context.settings().service() + ); + }); + } + + +Registering ``SmithyIntegration``\ s +==================================== + +Implementations of integrations are registered with Java :term:`SPI` by +adding a specific ``META-INF`` file, and are then discovered on the classpath. For example, +if the integration class is defined as +``software.amazon.smithy.python.client.PythonIntegration``, then when using +:term:`Gradle`, the fully qualified class name of each implementation of the +integration needs to be placed in a file named +``src/main/resources/META-INF/services/software.amazon.smithy.python.client.PythonIntegration``. + +For example: + +.. code-block:: python + + # in src/main/resources/META-INF/services/software.amazon.smithy.python.client.PythonIntegration + software.foobaz.AddCodeLicense + + +Using ``SmithyIntegration``\ s in generators +============================================ + +:ref:`directedcodegen` automatically handles finding integrations on the +classpath, topologically ordering them, and applying each integration method at +the appropriate point of code generation.
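
For generators that do not use ``DirectedCodegen``, the discovery step can be approximated by hand. The following is a minimal sketch, not the Smithy implementation: ``ExampleIntegration`` is a hypothetical stand-in for a generator's own integration interface, and ordering by a numeric priority is a simplification of the topological ordering described above. Only ``java.util.ServiceLoader`` is a real API here.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.ServiceLoader;

final class IntegrationDiscovery {

    /** Hypothetical stand-in for a generator-specific integration interface. */
    interface ExampleIntegration {
        default String name() {
            return getClass().getName();
        }

        // Assumption for this sketch: higher priority integrations are applied first.
        default int priority() {
            return 0;
        }
    }

    /** Loads every implementation listed in a META-INF/services file on the classpath. */
    static List<ExampleIntegration> discover(ClassLoader classLoader) {
        List<ExampleIntegration> found = new ArrayList<>();
        for (ExampleIntegration integration : ServiceLoader.load(ExampleIntegration.class, classLoader)) {
            found.add(integration);
        }
        return ordered(found);
    }

    /** Orders by descending priority, then by name so the result is deterministic. */
    static List<ExampleIntegration> ordered(List<ExampleIntegration> integrations) {
        List<ExampleIntegration> sorted = new ArrayList<>(integrations);
        sorted.sort(Comparator
                .comparingInt((ExampleIntegration integration) -> integration.priority())
                .reversed()
                .thenComparing(ExampleIntegration::name));
        return sorted;
    }
}
```

A generator would call ``discover`` once, then invoke each integration hook in the returned order. If no services file is present on the classpath, the returned list is simply empty.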
diff --git a/docs/source-2.0/guides/codegen/mapping-shapes-to-languages.rst b/docs/source-2.0/guides/codegen/mapping-shapes-to-languages.rst new file mode 100644 index 00000000000..1ece02f3029 --- /dev/null +++ b/docs/source-2.0/guides/codegen/mapping-shapes-to-languages.rst @@ -0,0 +1,629 @@ +-------------------------------------- +Mapping Smithy Shapes to Your Language +-------------------------------------- + +One of the first design documents to write is how shapes in Smithy +will map to types in your target environment. The way shapes map to a +target environment can also vary depending on if you are generating a +client or server. + + +Interoperability +================ + +When determining how shapes are represented in a target environment, adhering +to the Smithy specification is a hard requirement and the first of Smithy's +tenets. This implies two practical considerations to keep in mind: + +1. Generators cannot create a client that will break backward + compatibility if a service team makes a backward compatible change + according to the Smithy specification. Any deviation from this should + require opt-in from the end-user and emit warnings. +2. Code generators should not enforce their own restrictions on top of + the restrictions defined in the Smithy model. For example, if a + particular identifier is a reserved word in the target programming + language, the code generator should automatically modify the + identifier to deconflict it with the reserved word. + + +Smithy shapes +============= + +This section identifies the non-exhaustive shapes and traits that code +generators need to account for (additional details on each type of shape +and trait can be found in the Smithy specification). + +.. note:: + + When mapping Smithy shapes to a target environment, you may + decide that the abstractions provided in the standard library of the + target environment aren't ergonomic enough or don't map well to Smithy. 
+ In these cases, you can provide your own abstractions. For example, the + AWS SDK for Java created an `SdkBytes`_ class to make it easier to + provide the contents of a blob to the SDK. + + +When to Generate unique named types +----------------------------------- + +The following shapes *should not* generate uniquely named types based on the +name provided in a model: + +* blobs +* booleans +* strings +* numbers +* documents +* lists +* maps + +When possible, the above types should use the target environment's equivalent +language built-in type (for example, a Smithy string would become a +Java ``String``). Creating Smithy-specific types where an idiomatic language +built-in type is available hurts the developer experience of the generator +and should be avoided. Furthermore, changing the names of these types in the +Smithy model should not impact generated code. + +The following shapes *are* expected to translate into named generated types +or methods in the target environment: + +* services +* operations +* structures +* unions +* structure and union member names +* enums +* intEnums +* enum and intEnum member names + + +Blob +---- + +Blobs represent opaque binary data. There are three kinds of blobs: + +1. Blobs that are expected to fit into memory. These shapes are not + marked with the :ref:`streaming-trait`. Such blobs should be + represented as a kind of byte array that is stored in memory + (for example, ``byte[]`` in Java, or a string in PHP). +2. Unbounded blobs that are not expected to fit into memory. These + blobs are marked with the ``streaming`` trait and should be + represented using some kind of streaming abstraction that + can work with potentially infinite streams of data. +3. Bounded blobs that are not expected to fit into memory. These + blobs are marked with both the ``streaming`` trait and the + :ref:`requireslength-trait`, which implies that the stream has + some method for "telling" callers its length.
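
The three cases above amount to a small decision table. The sketch below shows one way a Java generator might encode it, under stated assumptions: the two boolean parameters stand in for whether the blob shape carries the ``streaming`` and ``requiresLength`` traits, and ``com.example.runtime.SizedStream`` is a hypothetical runtime type, not something Smithy provides.

```java
final class BlobMapping {

    /** The three kinds of blobs described in the list above. */
    enum BlobKind { IN_MEMORY, UNBOUNDED_STREAM, SIZED_STREAM }

    /** Picks a blob kind from the traits found on the shape. */
    static BlobKind classify(boolean hasStreamingTrait, boolean hasRequiresLengthTrait) {
        if (!hasStreamingTrait) {
            // requiresLength is only meaningful on streaming blobs, so it is ignored here.
            return BlobKind.IN_MEMORY;
        }
        return hasRequiresLengthTrait ? BlobKind.SIZED_STREAM : BlobKind.UNBOUNDED_STREAM;
    }

    /** Maps each kind to the type name a generator could emit. */
    static String javaTypeFor(BlobKind kind) {
        switch (kind) {
            case IN_MEMORY:
                return "byte[]";
            case UNBOUNDED_STREAM:
                return "java.io.InputStream";
            case SIZED_STREAM:
                // Hypothetical runtime type that pairs a stream with a known length.
                return "com.example.runtime.SizedStream";
            default:
                throw new IllegalArgumentException("Unknown blob kind: " + kind);
        }
    }
}
```

Keeping the classification separate from the type names makes it easier to reuse the same decision for clients and servers, which may choose different stream abstractions.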
+ + +Boolean +------- + +Boolean shapes in Smithy represent true or false values. These should +always map to a language's standard Boolean type. + + +Document +-------- + +The document type represents untyped data. Document types by default are +JSON-like values that can be set to a string, number, boolean, list, map, +or null. Document types are generally used for truly untyped data that +users are expected to dynamically inspect at runtime. + +.. note:: + + Document types are limited to the JSON data model for now; however, + future support for other type systems *could* be added to document + types (for example, a CBOR document). + + +String +------ + +Strings should be represented using string types from the target +environment's standard library when possible. + +- Avoid creating a custom string type for Smithy code generators unless + it's absolutely necessary. +- Never generate specific types for normal strings; the name of a + normal string shape is irrelevant and should not appear in generated + code. +- Code used to represent strings must be able to losslessly round-trip + UTF-8 data. Don't create a custom string type if your programming + language represents strings as bytes (like in PHP) or uses UTF-16 + (like Java). However, if the string type in a given language can only + contain, say, ASCII characters, then you should use some kind of byte + array to represent strings. + +Strings in Smithy can be marked with the :ref:`enum-trait`; however, +Smithy code generators should transform models prior to code generation +to convert these kinds of strings to proper ``enum`` shapes. + + +Enums +----- + +Smithy IDL V2 introduced a proper :ref:`enum` shape that obsoletes the +:ref:`enum-trait`. Enums define a set of allowed string values that can be +provided for the shape. + +.. important:: + + Because client implementations often lag behind services, clients + *must not* fail to deserialize or serialize unknown enum values.
+ For example, implementations could use a kind of discriminated union + with a catch-all unknown value placeholder, provide additional + accessor methods to retrieve the raw string value of an enum, + or some other technique to carry unknown values. + +Consider the following Smithy model: + +.. code-block:: smithy + + enum Suit { + DIAMOND + CLUB + HEART + SPADE + } + +It could be generated as the following enum in Rust: + +.. code-block:: rust + + #[non_exhaustive] + enum Suit { + DIAMOND, + CLUB, + HEART, + SPADE, + + #[non_exhaustive] + Unknown(String) + } + +Notice that unknown enum variants are captured in ``Unknown``, along +with the unknown value. This allows the enum type to store newly added +values that the client doesn't yet know about. Also note that the enum +is ``non_exhaustive``, because new enum values can be added in the +future, and we want consumers of the generated code to account for this. + +Smithy also supports :ref:`intEnum`. It's just like an ``enum`` but is an +enum of integer values. ``intEnum`` shapes must also support sending and +receiving unknown integer values to account for newly added enum +members. For example: + +.. code-block:: smithy + + intEnum FaceCard { + JACK = 1 + QUEEN = 2 + KING = 3 + ACE = 4 + JOKER = 5 + } + + +Timestamp +--------- + +A :ref:`timestamp` shape represents an instant in time with no UTC +offset or timezone. For example, to represent a timestamp in Java, +you would use `java.time.Instant`_ and not `java.time.OffsetDateTime`_ +because a timestamp has no UTC offset. + +The serialization format of a timestamp is an implementation detail +determined by a :ref:`protocol ` and must not +have any effect on the types exposed by tooling to represent a timestamp +value. If a timestamp in one service is serialized as a string and in +another service as an integer, the type exposed by the code generator to +represent these timestamps must be exactly the same type.
Put another +way: changing the protocol and serialization format of a timestamp should +not break previously generated code. + + +Numbers: byte, short, integer, long, float, double, bigInteger, bigDecimal +-------------------------------------------------------------------------- + +Smithy supports various numeric types. If a target environment does not +support smaller types like byte, short, or float, then these types +should be rolled into the next largest supported numeric type (e.g., +byte → integer, short → integer, float → double). + +If a target environment does not support :ref:`bigInteger` (an arbitrary +precision integer) or :ref:`bigDecimal` (an arbitrary precision decimal), +then a library dependency should be used iff one of these types are +encountered or the runtime library of the generator should provide an +implementation. + +.. note:: + + If a library is needed to provide support for larger numeric types, + then the library should only be required conditionally if the type is + used in the service closure. This can be handled automatically using + Smithy symbol and symbol dependency abstractions, or by crawling the + shapes in a service closure to detect specific types. + + +List +---- + +The *list* type represents an ordered homogeneous collection of values. +A list type should be code generated using the list or array type +provided in the standard library of the target environment. + +.. rubric:: Value presence + +* List values are always present (non-nullable) unless the list is marked + with the ``@sparse`` trait. + + +Ignore set shapes from Smithy 1.0 +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The ``set`` type was deprecated in Smithy 1.0 and removed in Smithy +2.0. Smithy model implementations should automatically add the +:ref:`uniqueItems-trait` to set shapes, and code generators should +treat set shapes exactly like list shapes marked with ``uniqueItems``. + +.. 
note:: + + When using the Smithy Java reference implementation, the + ``uniqueItems`` trait is automatically added to set shapes, and the + class used to represent set shapes, SetShape, extends from ListShape, + allowing you to ignore the difference between list and set shapes + altogether. + + +Map +--- + +The :ref:`map` type represents a map data structure that maps +``string`` keys to homogeneous values. Maps are not required to +maintain insertion order. Implementations should use the idiomatic +map data structure of the target environment when possible. + +.. rubric:: Key and value presence + +* Map keys are always present and never nullable. +* Map values are always present (non-nullable) unless the map is + marked with the :ref:`sparse-trait`. + + +Structure +--------- + +The *structure* type represents a fixed set of named, heterogeneous +values. Structures are always code generated and use the name provided +in the model. Structures are generally code generated into things like +POJOs, POCOs, etc. Smithy IDL v2 will support things like default zero +values used to initialize values to things like empty lists, 0, empty +maps, etc. Some target environments allow types to be created using a +kind of literal syntax that does not perform any custom initialization. +In these cases, it may be necessary to use a constructor method in order +to set members to their default zero values if needed. + +Structure and union members are ordered based on the order they are +defined in the model. When adding new members, they should be added to +the end of the structure. While this allows code generators like C++ to +maintain ABI compatibility, it requires extreme levels of rigor to +enforce that every change will be ABI compatible. + + +Error structures +~~~~~~~~~~~~~~~~ + +Structures marked with the ``@error`` trait should be code generated as +a kind of error type or exception type in the target environment. 
A good +design goal for errors generated from Smithy models is to allow generic +abstractions to work across generated Smithy clients. For example, +developers should be able to create a middleware that can be used with +any Smithy generated client to check if an error is a client error, +server error, retryable, or throttling error. The :ref:`retryable-trait` +is used to describe if an error can be retried, and the ``throttling`` +property of this trait describes if the error is due to throttling. This +information should be exposed by the generated type in some way. + +Errors could have a kind of hierarchy resembling the following (note +that other error conditions like networking errors need to be accounted +for as well): + +- Service specific error: a top-level error type generated specifically + for every error the service can return. This error is used when an + unmodeled exception is encountered. +- Client Error: Error used when an ``@error`` trait is set to "client". +- Server Error: Error used when an ``@error`` trait is set to "server". + + +Union +----- + +The union type represents a `tagged union data +structure `__ that can take +on several different, but fixed, types. Unions function similarly to +structures except that only one member can be used at any one time. +Unions are always code generated and use the name provided in the model. +Code generators should provide some kind of abstraction to make union +types easier to use. For example, if a target environment supports sum +types or discriminated unions, use them. Sealed classes with specific +subtypes for each variant of the union are also good options. + +- The member that is set in a union cannot be optional. +- There must be exactly one member of the union set to a non-null + value. +- Clients must account for unknown union values by storing the name of + the unknown variant. 
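
The union rules above can be illustrated with a sealed interface, one of the options the section mentions. This is a minimal sketch that assumes Java 17 or later and a hypothetical Smithy union named ``Attribute`` with ``text`` and ``count`` members; the ``fromMember`` helper stands in for whatever a real deserializer would do.

```java
import java.util.Objects;

final class UnionExample {

    /** Hypothetical generated type for a Smithy union named Attribute. */
    sealed interface Attribute {
        record Text(String value) implements Attribute {
            public Text {
                // The member that is set can never be null.
                Objects.requireNonNull(value, "value");
            }
        }

        record Count(Integer value) implements Attribute {
            public Count {
                Objects.requireNonNull(value, "value");
            }
        }

        /** Carries the name of a variant this client does not know about yet. */
        record Unknown(String memberName) implements Attribute {
            public Unknown {
                Objects.requireNonNull(memberName, "memberName");
            }
        }
    }

    /** Stand-in for deserialization: picks a variant by member name. */
    static Attribute fromMember(String memberName, Object value) {
        switch (memberName) {
            case "text":
                return new Attribute.Text((String) value);
            case "count":
                return new Attribute.Count((Integer) value);
            default:
                // Newly added members are preserved by name rather than rejected.
                return new Attribute.Unknown(memberName);
        }
    }

    static String describe(Attribute attribute) {
        if (attribute instanceof Attribute.Text text) {
            return "text: " + text.value();
        } else if (attribute instanceof Attribute.Count count) {
            return "count: " + count.value();
        } else if (attribute instanceof Attribute.Unknown unknown) {
            return "unknown member: " + unknown.memberName();
        }
        throw new IllegalStateException("Unhandled variant: " + attribute);
    }
}
```

Because each variant is its own record, exactly one member is set by construction, and the ``Unknown`` variant lets a client built from an older model receive a newer union without failing.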
+ + +Unit types in unions +~~~~~~~~~~~~~~~~~~~~ + +Union members may target Smithy's built-in unit type, +``smithy.api#Unit``, meaning the variant has no meaningful value. If a +union member targets the unit type, implementations should generate +code that omits the value for that variant or sets the value to a +specific type (e.g., ``Void`` in Java). You can detect if a member +targets the Unit type using the following: + +.. code-block:: java + + import software.amazon.smithy.model.traits.UnitTypeTrait; + + for (MemberShape member : unionShape.members()) { + if (member.getTarget().equals(UnitTypeTrait.UNIT)) { + // Generate special code to handle unit types. + } else { + // The member is a normal shape. + } + } + + +Service +------- + +A *service* is the entry point of an API that aggregates resources and +operations together. The service shape will tell you which protocols a +service supports, which auth schemes it supports, the operations of the +service, and the resources contained in the service. + + +Computing a service closure +~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +The closure of shapes connected to a service is the set of shapes that will be +code generated. You can compute this closure using a +`Walker `__: + +.. code-block:: java + + Walker walker = new Walker(someModel); + Set<Shape> closure = walker.walkShapes(someService); + +You can get the entire set of operations contained in a service using a +``TopDownIndex``: + +.. code-block:: java + + TopDownIndex index = TopDownIndex.of(model); + Set<OperationShape> operations = index.getContainedOperations(someService); + +.. tip:: + + :ref:`directedcodegen` automatically handles this for you. + + +Service renames +~~~~~~~~~~~~~~~ + +Services might need to "rename" shapes in order to disambiguate shapes +that share the same name. This is done so that namespaces in the Smithy +model do not need to have a 1:1 namespace mapping in generated code.
When +determining the name of a shape for use in codegen, never rely on the +shape ID directly, but rather first check if the shape was renamed +within the closure of a service. This can be done by passing a +``ServiceShape`` into ``ShapeId#getName``: + +.. code-block:: java + + // Good! + String goodCodegenName = someShapeId.getName(someServiceShape); + + // Bad! + String badCodegenName = someShapeId.getName(); + + +Operation +--------- + +The *operation* type represents the input, output, and possible errors +of an API operation. + + +Generating unique input and output shapes +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +Client code generators must generate distinct types for all operation +input and output shape structures. Members of an input structure should +all be treated as optional regardless of whether the member is marked with +the ``@default`` trait or ``@required`` trait. This allows service teams +to evolve their API without breaking previously generated clients. + +A more recent feature of Smithy allows marking structures as specific to +the input or output of an operation using ``@input`` and ``@output`` +traits. You can transform the model being code generated and create +synthetic input and output shapes when necessary using the +`createDedicatedInputAndOutput `_ +model transformer. The following example creates a new Model that has +dedicated input and output shapes for every operation, each marked with +the ``@input`` or ``@output`` trait, and each uses a consistent name +that ends with ``Input`` or ``Output``. + +.. code-block:: java + + ModelTransformer transformer = ModelTransformer.create(); + Model transformed = transformer.createDedicatedInputAndOutput( + model, "Input", "Output" + ); + +.. tip:: + + :ref:`directedcodegen` automatically handles this for you. + + +Resource +-------- + +A *resource* is an entity with an identity that has a set of operations. + +Resources add hierarchy to a model.
You will need to traverse from a +service to every operation and resource in order to crawl the entire +service. This process can be simplified using the +`TopDownIndex `__. +Iterating only over the operations attached to a service will not +provide every operation in the closure of the service. + +.. note:: + + Exposing resource abstractions through code, if attempted, should + be done in addition to a more traditional, flattened, service + interface with every operation contained in the service. + + +Other shape topics +================== + +Shapes can be recursive +----------------------- + +Smithy :ref:`shapes support recursion `. +Some languages like Rust require the size of types to be known at +compile time. Recursive types in these languages need some kind of heap +allocation to ensure they have a size known at compile time. In order to +identify which member in a recursive loop needs to be heap allocated, +implementations will need to utilize a topological sort. Smithy's +:ref:`directedcodegen` abstraction will automatically generate code based +on a topological sort, though generators that need more control over how to handle recursion will need to manually +use a `TopologicalIndex `__. + +.. seealso:: + + `Rust design doc `__ + for how they handled recursive shapes. + + +Mixins are an implementation detail of the model +------------------------------------------------ + +Mixins are considered an implementation detail of a model and should not +impact code generation. Code generators should transform the model prior +to code generation to remove mixins. + +.. seealso:: + + :ref:`Flattening mixins ` + + +Member optionality +------------------ + +Smithy has different rules around when a member is always present or +optional. The rules around nullability are defined in the Smithy +specification. However, all of this complexity is accounted for +automatically using the +`NullableIndex `_. + +.. 
code-block:: java + + NullableIndex index = NullableIndex.of(model); + if (index.isNullable(someMember)) { + // optional + } else { + // always present + } + + +FAQ +=== + +Should constraint traits impact generated types? +------------------------------------------------ + +In general, no. Baking anything about these traits into generated types +makes these traits impossible to change in the future without breaking +previously generated code. + +- Length and pattern traits should have no impact on generated types. + Shapes with a length or pattern trait should be represented as a + standard string type. +- Range traits must have no impact on generated types. Code generators + must not rely on range traits to determine which numeric type is best + for representing a Smithy shape. Instead, code generators must rely + on the Smithy type used for the numeric shape (integer, long, etc). +- The ``@enum`` trait was replaced by the ``enum`` shape in Smithy IDL + 2.0. Both the ``@enum`` trait and ``enum`` shape can influence code + generation, though they must be considered a specialization of + strings. This allows servers to add new enum values over time without + breaking previously generated clients. +- The ``@required`` trait is no longer considered a constraint trait in + Smithy IDL 2.0. It is now expected to influence code generated types. + + +Should clients enforce constraint traits? +----------------------------------------- + +No. A client should defer the validation of constraint traits to the +service. + + +Why don't we validate constraint traits on the client? +------------------------------------------------------ + +Validating constraint traits on the client makes it extremely hard to +change constraint traits, even in what appears to be a backward +compatible change. Changes to constraint traits need to be backward +compatible by making the constraint more relaxed. 
However, things fall +apart when multiple actors are creating resources using different +versions of the service. + +Years ago, many AWS SDKs validated constraint traits client-side and +refused to send non-compliant input. During that period, Amazon EC2 +updated instance IDs to support a longer string, and existing SDKs began +to break when this change was deployed. That's because the use of other +tools like the AWS Management Console or even managed services created +instances that SDKs could no longer interact with. If a client +encountered an instance created in the console and then tried to make a +subsequent call using the instance ID, the client would refuse to send +the request. This resulted in a large amount of customer pain and took +months of effort to correct. If the client ignored constraint traits and +allowed the service to enforce them, the change would have been +transparent to previously generated clients. + + +Do generators need to worry about mixins? +----------------------------------------- + +No. Mixins are a modeling abstraction and not something that is meant to +be reflected in generated code. Mixins should be flattened before +generating code. This is something that :ref:`directedcodegen` +can handle for you. + +.. seealso:: + + :ref:`Flattening mixins ` + + +Is there an easier way to account for errors of operations inheriting service errors? +------------------------------------------------------------------------------------- + +Yes. You can flatten error hierarchies before generating code. This is +also something that :ref:`directedcodegen` can handle for you. + +.. seealso:: + + :ref:`Copying service errors to operation errors ` + + +.. _SdkBytes: https://github.com/aws/aws-sdk-java-v2/blob/master/core/sdk-core/src/main/java/software/amazon/awssdk/core/SdkBytes.java +.. _java.time.Instant: https://docs.oracle.com/javase/8/docs/api/java/time/Instant.html +.. 
_java.time.OffsetDateTime: https://docs.oracle.com/javase/8/docs/api/java/time/OffsetDateTime.html diff --git a/docs/source-2.0/guides/codegen/overview-and-concepts.rst b/docs/source-2.0/guides/codegen/overview-and-concepts.rst new file mode 100644 index 00000000000..9e639c3b09d --- /dev/null +++ b/docs/source-2.0/guides/codegen/overview-and-concepts.rst @@ -0,0 +1,230 @@ +--------------------- +Overview and Concepts +--------------------- + +This section provides an overview of what a code generator is, a high +level overview of how to build a code generator, and introduces the +concepts that are used when building a Smithy code generator. + + +What you're building +==================== + +You're building one or more Smithy-Build codegen plugins to generate a +client, server, and types from Smithy models for a *target environment*. +A *target environment* is the intended programming language and specific +runtime environment. + +*Smithy-Build* is a tool used to transform Smithy models into artifacts +like code or other models. Smithy-Build plugins are implemented to +perform these transformations. Codegen plugins are configured with +*smithy-build.json* files, implemented in Java, Kotlin, or another JVM +language, and discovered on the classpath through Java Service Provider +Interfaces (SPI). + +Consider the following smithy-build.json file: + +.. code-block:: json + + { + "version": "1.0", + "plugins": { + "foo-client-codegen": { + "service": "smithy.example#Weather", + "package": "com.example.weather", + "packageVersion": "0.0.1", + "edition": "2022" + } + } + } + +This file tells Smithy-Build to generate a hypothetical Foo language +client using the ``foo-client-codegen`` plugin found on the classpath. + +.. 
seealso:: + + - `Smithy Gradle plugin `__ + - `DirectedCodegen `__ + to more easily implement codegen + - :doc:`configuring-the-generator` + + +Design first, generate second +============================= + +The first step to writing a code generator is to *not* write the code +generator. The code generator is an implementation detail. First, you +need to decide what code you want to generate. Pick a Smithy model and +manually map each concept of the model to hand-written code. In fact, +nearly every aspect of the product you intend to eventually generate can +be hand-written as a proof of concept before writing any of the code +generator. Things to consider are: + +1. How will Smithy types map to types in your programming language? +2. What will the client and server interfaces look like for a modeled service? +3. How are clients and servers created and configured? +4. How will you allow the client or server to be customized at runtime? + + +Design documents +================ + +You are encouraged to document major design decisions to explain why design +choices were made and leave a record for future contributors. + +Example Smithy codegen design documents: + +- https://awslabs.github.io/smithy-rs/design/ +- https://github.com/awslabs/smithy-kotlin/tree/main/docs/design +- https://github.com/awslabs/aws-sdk-kotlin/tree/main/docs/design +- https://github.com/awslabs/smithy-ruby/wiki + + +Phases of code generation +========================= + +There are three phases of code generation. + +1. **Codegen-time**: The phase in which code is being generated for the + target environment. + + * Depends on Smithy models + * Typically written in Java if using the Smithy reference implementation + * Uses Smithy-Build + * Generates code + * Generates dependencies +2. **Compile-time**: Performed in the target environment to compile and/or + verify generated code. This phase may be optional for languages that + aren't compiled.
However, linting and static analysis are also considered + part of the compile-time phase. +3. **Runtime**: generated code is run in the target environment. The Smithy + model, Java, and Smithy reference implementation are not required at + runtime. + + +Runtime libraries +================= + +The libraries and code that are used to power a client, server, +serialization, and deserialization are called *runtime libraries*. The +code generator needs to have prior knowledge of these libraries and how +to call them. The code generated by a code generator is expected to +automatically work based on the Smithy model the code was generated +from. For example, if the model contains a service shape marked with the +:ref:`aws.auth#sigv4-trait` for auth, then the generated code should be +configured to use `AWS Signature version 4`_ and have a dependency on any +necessary libraries for the target environment. + +Deciding on the libraries you use, which dependencies you take, and what +public interfaces you expose is part of the design phase of both the +generator and runtime libraries. The runtime libraries can be designed +separately from the code generator, but there does need to be some +consideration given to how a code generator will configure and compose +runtime components at codegen-time. + + +You don't need Smithy models at runtime +======================================= + +Smithy code generators should utilize `model-ignorant code generation`_, +a method of generating code that does not require the models the code +was generated from to be available at runtime. This makes the Smithy +model itself an implementation detail to the generated code, and it +removes the need to write a Smithy implementation in the target +environment. Code generated from Smithy models does not need the Smithy +model at runtime because things like routing, serialization, +deserialization, and orchestration can all be generated at codegen-time. 
+If any elements of the Smithy model need to be made available at runtime, +they can be made available using other language-specific mechanisms like +Java annotations, Rust attributes, interfaces, etc. + + +Client, server, and type code generation +======================================== + +Smithy code generators should be able to generate clients, servers, and +types. Each of these use cases should be served by a different +``smithy-build.json`` plugin, though they should all rely on a shared +implementation. For example, here's how service code generation could be +configured for a Java code generator: + +.. code-block:: json + + { + "version": "1.0", + "projections": { + "source": { + "plugins": { + "java-server-codegen": { + "service": "com.bigco.example#Example", + "package": "com.bigco.example", + "packageVersion": "0.0.1", + "edition": "2022" + } + } + } + } + } + + +Client generation +----------------- + +All Smithy implementations should generate clients. + +- This is where most code generators should start. +- Clients generated from a model should not use the exact same types + and interfaces as a service generated from a model. This is + because (1) many Smithy services use *projections* to generate + clients, and the projections often have features removed that are + internal-only or available to a subset of customers. (2) + servers are *authoritative;* they have perfect knowledge of the + model and can generate stricter types. Clients are + *non-authoritative* and need to guard against model updates that + are considered backward compatible (for example, adding a new + ``enum`` member). +- AWS SDKs are built on top of Smithy clients, but Smithy clients + are not AWS SDKs. 
Smithy clients do not require the use of AWS + protocols, signing algorithms, regions and endpoint resolution, + ``~/.aws/config``, etc. (note that Smithy does not support a + first-party protocol **today**, so in practice most clients will + likely rely on an AWS protocol like the + ``aws-restjson1-protocol``). + + +Server generation +----------------- + +Some Smithy code generators will generate service framework code. This +can include service interfaces, stubs to implement each operation, request +deserializers, response serializers, etc. + +- If you know that your language will also provide a service + framework, it's best to start the service development while the + clients are being developed. This helps to ensure that as much + code as possible can be shared across the generators. +- Even if you don't plan on writing a service right now, it does + help to think about *how* a service code generator could be added + in a way that can reuse much of the client code generation. +- When adding features to generated types and interfaces, consider + whether the feature is applicable to both client and server code. If it + isn't, then the feature should either be removed, refactored, or + added in such a way that it is only optionally generated for + clients. + + +Type generation +--------------- + +Smithy code generators can generate standalone types. For example, this +would happen when a service has no operations or resources but only shapes +bound to the service via the (upcoming) +`shapes property `__. + +- Generation of types should still require a service shape that is + used to create a closure of shapes. +- The service shape dictates the serialization formats supported by + the generated types using :ref:`protocol traits `. + +.. _AWS Signature version 4: https://docs.aws.amazon.com/general/latest/gr/signing-aws-api-requests.html +.. 
_model-ignorant code generation: https://www.martinfowler.com/dslCatalog/modelIgnorantGeneration.html diff --git a/docs/source-2.0/guides/codegen/using-the-semantic-model.rst b/docs/source-2.0/guides/codegen/using-the-semantic-model.rst new file mode 100644 index 00000000000..4ab701cf8e5 --- /dev/null +++ b/docs/source-2.0/guides/codegen/using-the-semantic-model.rst @@ -0,0 +1,587 @@ +------------------------ +Using the Semantic Model +------------------------ + +The Java reference implementation of Smithy provides various +abstractions to interact with the in-memory semantic model. This +document provides a kind of "cookbook" for achieving various tasks with +the Smithy model. + + +Traversing the model +==================== + +Each of the following examples assumes a variable named ``model`` is +defined that is a ``software.amazon.smithy.model.Model``. + + +Iterate over all shapes +----------------------- + +``Model#toSet`` is a cheap operation that just provides a ``Set<Shape>`` +view over a model. + +.. code-block:: java + + for (Shape shape : model.toSet()) { + // ... + } + + +Iterate over all shapes of a specific type +------------------------------------------ + +Each type of shape in Smithy has a dedicated ``Model#getXShapes`` +method. These methods are cheap to invoke. They just provide a +filtered ``Set`` view over a model. + +.. code-block:: java + + for (ServiceShape shape : model.getServiceShapes()) { + // ... + } + + for (StructureShape shape : model.getStructureShapes()) { + // ... + } + + for (MemberShape shape : model.getMemberShapes()) { + // ... + } + + // etc... + + +Iterate over all shapes with a specific trait +--------------------------------------------- + +``Model#getShapesWithTrait`` returns shapes that have a specific trait. +This is a cheap method to call and uses caches internally. The provided +trait class can be retrieved from each returned shape. The following +example uses ``DeprecatedTrait``, but any trait class can be used. + +.. 
code-block:: java + + for (Shape shape : model.getShapesWithTrait(DeprecatedTrait.class)) { + DeprecatedTrait trait = shape.expectTrait(DeprecatedTrait.class); + } + + +Iterate over shapes of a specific type with a specific trait +------------------------------------------------------------ + +``Model#getXShapesWithTrait`` returns shapes of type ``X`` that have a +specific trait. Each type of shape has a dedicated ``Model#getXShapesWithTrait`` +method. This is a cheap method to call and uses caches internally. +The provided trait class can be retrieved from each returned shape. The +following example uses ``SensitiveTrait``, but any trait class can be used. + +.. code-block:: java + + for (StructureShape shape : model.getStructureShapesWithTrait(SensitiveTrait.class)) { + SensitiveTrait trait = shape.expectTrait(SensitiveTrait.class); + } + + for (StringShape shape : model.getStringShapesWithTrait(SensitiveTrait.class)) { + SensitiveTrait trait = shape.expectTrait(SensitiveTrait.class); + } + + // etc... + + +Stream over all shapes +---------------------- + +.. code-block:: java + + Stream<StringShape> strings = model.shapes(StringShape.class) + .filter(shape -> shape.getId().getNamespace().equals("foo.bar")); + +.. tip:: + + In general, prefer the named methods that convert ``Model`` to a set. + However, it's sometimes useful to break down complicated + pipeline-style transformations into streams. + + +Traversing the members of a shape +--------------------------------- + +.. code-block:: java + + StructureShape struct; + + for (MemberShape member : struct.members()) { + // Get the shape targeted by the member. + Shape target = model.expectShape(member.getTarget()); + System.out.println(member.getMemberName() + " targets " + target); + + // Get the container of the member. + Shape container = model.expectShape(member.getContainer()); + } + +.. 
note:: + + - Members are ordered based on the order given in the Smithy model. + - You can order the members differently if needed (for example, sorting + them using a ``TreeMap``). + - The above code works the same way for any shape, whether it's a + structure, union, list, set, or map. + - By the time a code generator is running, the model has been + thoroughly validated. You should use the various methods that start + with ``expect`` to more easily interact with shapes. + + +Visiting shapes +--------------- + +Smithy often relies on *visitors* to dispatch to different typed methods +for handling different kinds of shapes. + +.. code-block:: java + + // Silly example that returns the number of members a shape has. + ShapeVisitor<Integer> visitor = new ShapeVisitor.Default<Integer>() { + @Override + protected Integer getDefault(Shape shape) { + return 0; + } + + @Override + public Integer listShape(ListShape shape) { + return 1; + } + + @Override + public Integer mapShape(MapShape shape) { + return 2; + } + + @Override + public Integer structureShape(StructureShape shape) { + return shape.members().size(); + } + + @Override + public Integer unionShape(UnionShape shape) { + return shape.members().size(); + } + }; + + StringShape string = exampleThatGetsString(); + int count = string.accept(visitor); + assert count == 0; + +.. note:: + + - The ``accept`` method of a shape is used to apply a visitor to the + shape. + - You should typically use the ``ShapeVisitor.Default`` implementation to + implement a visitor. + - A simpler way to get the answer of the above example is to just call + ``shape.members().size()``. + + +Knowledge Indexes +================= + +Smithy provides various knowledge index implementations that are used to +break down more complex tasks into easily queried, pre-computed data +stores. These knowledge indexes are also cached on a ``Model`` object, +making them cheaper to use than recomputing information multiple times +across things like validators. 
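The caching described above can be pictured with a small, self-contained
sketch. This is not Smithy's implementation: ``MiniModel``,
``NameLengthIndex``, and ``getKnowledge`` are hypothetical stand-ins, used
only to illustrate an index that is computed on first use and then reused
for the same model instance.

.. code-block:: java

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.function.Function;

    // Small driver showing that repeated lookups share one index instance.
    final class KnowledgeCacheSketch {
        public static void main(String[] args) {
            MiniModel model = new MiniModel(List.of("Foo", "Barbaz"));
            NameLengthIndex first = NameLengthIndex.of(model);
            NameLengthIndex second = NameLengthIndex.of(model);
            System.out.println(first == second);
            System.out.println(first.lengthOf("Barbaz"));
        }
    }

    // Hypothetical stand-in for a model; this is NOT Smithy's Model class.
    final class MiniModel {
        final List<String> shapeNames;
        // One cached index instance per index class, scoped to this model.
        private final Map<Class<?>, Object> knowledge = new ConcurrentHashMap<>();

        MiniModel(List<String> shapeNames) {
            this.shapeNames = List.copyOf(shapeNames);
        }

        // Build the index on first request, then return the cached instance.
        <T> T getKnowledge(Class<T> type, Function<MiniModel, T> factory) {
            return type.cast(knowledge.computeIfAbsent(type, key -> factory.apply(this)));
        }
    }

    // Hypothetical index that precomputes a lookup table from the model.
    final class NameLengthIndex {
        private final Map<String, Integer> lengths = new HashMap<>();

        private NameLengthIndex(MiniModel model) {
            for (String name : model.shapeNames) {
                lengths.put(name, name.length());
            }
        }

        // Mirrors the shape of the ``SomeIndex.of(model)`` calls used in this guide.
        static NameLengthIndex of(MiniModel model) {
            return model.getKnowledge(NameLengthIndex.class, NameLengthIndex::new);
        }

        int lengthOf(String name) {
            return lengths.getOrDefault(name, -1);
        }
    }

In this sketch, the second call to ``NameLengthIndex.of(model)`` returns the
instance built by the first call, so the lookup table is populated once per
model. A different ``MiniModel`` instance gets its own index.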
+ + +Get every operation in a service or resource +-------------------------------------------- + +Service shapes can contain resources, which can contain operations. +``TopDownIndex`` will walk the service/resource to find all contained +operations. + +.. code-block:: java + + TopDownIndex index = TopDownIndex.of(model); + index.getContainedOperations(serviceShape); + + +Get every resource in a service or resource +------------------------------------------- + +Service shapes can contain resources, which can themselves contain +resources. ``TopDownIndex`` will walk the service/resource to find +all contained resources. + +.. code-block:: java + + TopDownIndex index = TopDownIndex.of(model); + index.getContainedResources(serviceShape); + + +Determine if a member is nullable +--------------------------------- + +Taking the version of the Smithy IDL into account when computing +the nullability of a member can be complex. ``NullableIndex`` +hides all of this complexity by providing a simple boolean result +for a given member shape. + +.. code-block:: java + + NullableIndex index = NullableIndex.of(model); + + if (index.isMemberNullable(someMemberShape)) { + // nullable + } + + +Get pagination information about an operation +--------------------------------------------- + +Resolving information about paginated operations in Smithy requires +some bookkeeping. ``PaginatedIndex`` tries to consolidate all the +information you might need when interacting with paginated traits. + +.. code-block:: java + + PaginatedIndex index = PaginatedIndex.of(model); + + index.getPaginationInfo(service, operation).ifPresent(info -> { + // Invoked only if the operation is paginated. 
+ System.out.println("Service shape: " + info.getService()); + System.out.println("Operation shape: " + info.getOperation()); + System.out.println("Input shape: " + info.getInput()); + System.out.println("Output shape: " + info.getOutput()); + System.out.println("Paginated trait: " + info.getPaginatedTrait()); + System.out.println("Input token member: " + info.getInputTokenMember()); + System.out.println("Output token member path: " + info.getOutputTokenMemberPath()); + // etc... + }); + + +Get the HTTP binding response status code of an operation +--------------------------------------------------------- + +The ``HttpBindingIndex`` can provide all kinds of information about +the HTTP bindings of an operation, including the response status +code. + +.. code-block:: java + + HttpBindingIndex index = HttpBindingIndex.of(model); + int code = index.getResponseCode(operationShape); + + +Get the request content-type of an operation +-------------------------------------------- + +``HttpBindingIndex`` can attempt to resolve the Content-Type header +of a request. The content-type might not be statically known by +the model and might rely on protocol-specific information. + +.. code-block:: java + + HttpBindingIndex index = HttpBindingIndex.of(model); + + String defaultPayloadType = "application/json"; + String eventStreamType = "application/vnd.amazon.event-stream"; + String contentType = index + .determineRequestContentType(operation, defaultPayloadType, eventStreamType) + .orElse(null); + + +Get the response content-type of an operation +--------------------------------------------- + +``HttpBindingIndex`` can attempt to resolve the Content-Type header +of a response. The content-type might not be statically known by +the model and might rely on protocol-specific information. + +.. 
code-block:: java + + HttpBindingIndex index = HttpBindingIndex.of(model); + + String defaultPayloadType = "application/json"; + String eventStreamType = "application/vnd.amazon.event-stream"; + String contentType = index + .determineResponseContentType(operation, defaultPayloadType, eventStreamType) + .orElse(null); + + +Get HTTP binding information of an operation +-------------------------------------------- + +.. code-block:: java + + HttpBindingIndex index = HttpBindingIndex.of(model); + var requestBindings = index.getRequestBindings(operationShape); + var responseBindings = index.getResponseBindings(operationShape); + + // This loop works the same way for request or response bindings. + for (var entry : requestBindings.entrySet()) { + String memberName = entry.getKey(); + HttpBinding binding = entry.getValue(); + System.out.println("Member: " + memberName); + System.out.println("Member shape: " + binding.getMember()); + System.out.println("Location: " + binding.getLocation()); + System.out.println("Location name: " + binding.getLocationName()); + binding.getBindingTrait().ifPresent(trait -> { + System.out.println("Binding trait: " + trait); + }); + } + + +Get the timestamp format used for a specific HTTP binding +--------------------------------------------------------- + +.. code-block:: java + + // Determine the format used for members bound to HTTP labels. + HttpBindingIndex index = HttpBindingIndex.of(model); + var formatUsedInPayloads = TimestampFormatTrait.Format.EPOCH_SECONDS; + var format = index.determineTimestampFormat( + member, HttpBinding.Location.LABEL, formatUsedInPayloads); + + +Get members that have specific HTTP bindings +-------------------------------------------- + +.. code-block:: java + + // Find every member in the input of the operation bound to an HTTP label. 
+ HttpBindingIndex index = HttpBindingIndex.of(model); + var locationTypeToFind = HttpBinding.Location.LABEL; + var result = index.getRequestBindings(operation, locationTypeToFind); + + +.. _codegen-transforming-the-model: + +Transforming the model +====================== + +It's often necessary to transform a Smithy model prior to code +generation. For example, you might need to remove operations that use +unsupported features, remove shapes that aren't in the closure of a +service, or add traits to shapes that are specific to your code +generator. Smithy provides a model transformation abstraction in +``ModelTransformer``. ``ModelTransformer`` provides various methods for +transforming a model, some of which are documented below. + + +Remove deprecated operations +---------------------------- + +``ModelTransformer`` will remove any broken relationships when a +shape is removed. If you remove an operation from the model, it's +removed from any service or resource. + +.. code-block:: java + + model = ModelTransformer.create().removeShapesIf(model, shape -> { + return shape.isOperationShape() && shape.hasTrait(DeprecatedTrait.class); + }); + + +Add a trait to every shape +-------------------------- + +.. code-block:: java + + model = ModelTransformer.create().mapShapes(model, shape -> { + return Shape.shapeToBuilder(shape).addTrait(new MyCustomTrait()).build(); + }); + +.. tip:: + + You can convert any shape to a builder using the static method + ``Shape#shapeToBuilder``. + + +.. _codegen-flattening-mixins: + +Flattening mixins +----------------- + +Mixins are used to share shape definitions across a model. They're +essentially build-time copy and paste, and they have no meaningful +impact on generated code. For example, the following model uses mixins: + +.. 
code-block:: smithy + + @mixin + structure HasUsername { + @required + username: String + } + + structure UserData with [HasUsername] { + isAdmin: Boolean + } + +Code generators should flatten mixins out of a model before generating +code, allowing them to more easily generate code without needing to +implement special handling for mixins. This can be done using a Smithy +model transformation: + +.. code-block:: java + + ModelTransformer transformer = ModelTransformer.create(); + Model transformedModel = transformer.flattenAndRemoveMixins(model); + +After flattening mixins, the above model is equivalent to: + +.. code-block:: smithy + + structure UserData { + @required + username: String + + isAdmin: Boolean + } + + +.. _codegen-copying-errors-to-service: + +Copying service errors to operation errors +------------------------------------------ + +Service shapes can define a set of errors that can be returned from any +operation. While this is great for modeling a service, it makes code +generation harder because an operation no longer lists all of its errors. + +For example: + +.. code-block:: smithy + + service MyService { + operations: [GetSomething] + errors: [ValidationError] + } + + operation GetSomething { + input := {} + output := {} + } + +Code generators can flatten these errors using a model transformer: + +.. code-block:: java + + ModelTransformer transformer = ModelTransformer.create(); + Model transformed = transformer.copyServiceErrorsToOperations(model, service); + +After flattening the error hierarchy, the above model is equivalent to: + +.. code-block:: smithy + + service MyService { + operations: [GetSomething] + } + + operation GetSomething { + input := {} + output := {} + errors: [ValidationError] + } + + +Remove shapes not in the closure of a service +--------------------------------------------- + +Smithy models can contain multiple services and shapes that aren't connected +to any service. 
Code generation is often easier if you remove shapes from the +model that are not connected to the service being generated. + +.. code-block:: java + + Walker walker = new Walker(model); + Set<Shape> closure = walker.walkShapes(someService); + model = ModelTransformer.create().removeShapesIf(model, shape -> !closure.contains(shape)); + + +Selectors +========= + +Selectors are used to find shapes in the model that match a query. While +you should typically not need selectors when writing Java code, they can +sometimes make getting the desired set of shapes far simpler than +writing complex loops and conditionals. Selectors have caveats similar +to those of regular expressions: selectors are slower than handwritten code, and +sometimes handwritten code is easier to understand than the DSL. Whether +a selector is appropriate for a given use case will mostly depend on the +complexity of the query and whether there's already a built-in abstraction +for what you're trying to do. + + +Creating Selectors +------------------ + +Let's say you want to find something complex, like every operation that +has a ``@streaming`` member in its input. This can be achieved through the +following selector: + +.. code-block:: java + + Selector selector = Selector.parse("operation :test(-[input]-> structure > member > [trait|streaming])"); + + +Finding shapes that match a selector +------------------------------------ + +``Selector#select`` finds every matching shape and puts them in a ``Set<Shape>``. + +.. code-block:: java + + Set<Shape> matches = selector.select(model); + + +Iterate over shapes that match a selector +----------------------------------------- + +If the result set does not need to be loaded into memory, then using +``shapes()`` is cheaper than using ``select()``. + +.. code-block:: java + + selector.shapes(model).forEach(shape -> { + // do something with each shape + }); + + +Reuse parsed ``Selector``\ s +---------------------------- + +Be sure to reuse a previously parsed selector if a selector will be used +repeatedly. 
For example, don't do this: + +.. code-block:: java + + // ❌ DON'T DO THIS ❌ + + for (var shape : model.getServiceShapes()) { + // This is bad! Reuse Selector instances! + // This has to parse the selector in each iteration of the loop. + Selector selector = Selector.parse(String.format( + "[id=%s] -> structure > member[trait|required]", + shape.getId())); + + selector.shapes(model).forEach(match -> { + // do something with each found shape + }); + } + +Instead, do this: + +.. code-block:: java + + // ✅ DO THIS + + // Parse once, outside the loop; the selector must not depend on the loop variable. + Selector selector = Selector.parse( + "structure > member[trait|required]"); + + for (var shape : model.getServiceShapes()) { + selector.shapes(model).forEach(match -> { + // do something with each found shape + }); + } diff --git a/docs/source-2.0/guides/index.rst b/docs/source-2.0/guides/index.rst index bf4dd9ed8dc..fb41eb1df5e 100644 --- a/docs/source-2.0/guides/index.rst +++ b/docs/source-2.0/guides/index.rst @@ -12,3 +12,4 @@ Guides converting-to-openapi generating-cloudformation-resources migrating-idl-1-to-2 + codegen/index diff --git a/docs/source-2.0/index.rst b/docs/source-2.0/index.rst index b3f62f609a7..0308b212b75 100644 --- a/docs/source-2.0/index.rst +++ b/docs/source-2.0/index.rst @@ -142,6 +142,7 @@ Read more guides/index Additional specs aws/index + glossary .. toctree:: :caption: Project diff --git a/docs/source-2.0/spec/aggregate-types.rst b/docs/source-2.0/spec/aggregate-types.rst index d8680e7f456..e4ef19897b5 100644 --- a/docs/source-2.0/spec/aggregate-types.rst +++ b/docs/source-2.0/spec/aggregate-types.rst @@ -465,6 +465,8 @@ by ``$``, followed by the member name. For example, the shape ID of the ``i32`` member in the above example is ``smithy.example#MyUnion$i32``. +.. _recursive-shape-definitions: + Recursive shape definitions ===========================