[GR-58477] Reduce the complexity of Native Image reachability metadata #9679

Open
loicottet opened this issue Sep 13, 2024 · 3 comments

@loicottet
Member

loicottet commented Sep 13, 2024

This issue aims to evaluate various methods to reduce the size of the current reachability metadata required for Native Image, specified in reachability-metadata.json files. The tradeoff when reducing metadata size is often an increase in image size, so this evaluation will seek to determine whether the changes proposed below have an acceptable impact on image size for the benefits they bring. To do so, we will look at the impact on the various libraries in the graalvm-reachability-metadata repository and the major microservice frameworks using Native Image.

Here is a summary of the current format of reachability-metadata.json:

{
  "reflection": [
    {
      "type": "reflectively.accessed.Type",
      "fields": [
        {
          "name": "field1"
        }
      ],
      "methods": [
        {
          "name": "method1",
          "parameterTypes": []
        }
      ],
      "allDeclaredConstructors": true,
      "allPublicConstructors": true,
      "allDeclaredFields": true,
      "allPublicFields": true,
      "allDeclaredMethods": true,
      "allPublicMethods": true,
      "unsafeAllocated": true
    }
  ],
  "jni": [
    {
      "type": "reflectively.accessed.Type",
      "fields": [
        {
          "name": "field1"
        }
      ],
      "methods": [
        {
          "name": "method1",
          "parameterTypes": []
        }
      ],
      "allDeclaredConstructors": true,
      "allPublicConstructors": true,
      "allDeclaredFields": true,
      "allPublicFields": true,
      "allDeclaredMethods": true,
      "allPublicMethods": true,
      "unsafeAllocated": true
    }
  ],
  "resources": [
    {
      "module": "optional.module.of.a.resources",
      "glob": "path1/level*/**"
    }
  ],
  "bundles": [
    {
      "name": "fully.qualified.bundle.name"
    }
  ],
  "serialization": [
    {
      "type": "serialized.Type",
      "customTargetConstructorClass": "serialized.super.Type"
    }
  ]
}

Include all resources from program JARs

Native Image currently needs to know the URL of every resource potentially accessed at run-time by the program. These resources can come from two sources:

  • Files from the user file system unrelated to the program: these files should not need to appear in the resource metadata and should simply be accessible at run-time. This can be achieved at no image size cost.
  • Resources from program JARs: the Native Image builder needs to know these at build-time. Removing them from the reachability metadata means including all resources from program JARs in the image. We will evaluate the impact of this change on image size.

Include all methods if one method is included

It is already possible to access elements of classes registered for reflection via query methods on Class like getMethods() or getDeclaredField() when the type is registered for reflection (see #3566). However, the Method, Field and Constructor objects obtained in this way do not support reflective access (e.g. Method.invoke() or Field.set()), as these capabilities potentially lead to a big increase in reachable code, and therefore image size. We will nevertheless investigate whether registering all methods and constructors of a type for reflective invocation as soon as one of them is registered (i.e. turning method reflection metadata into a single boolean choice) yields an image size increase that is acceptable for the metadata reduction it allows.

For example, this configuration:

{
  "reflection": [
    {
      "type": "registered.class.Name",
      "methods": [
        {
          "name": "methodName",
          "parameterTypes": []
        },
        ...
      ]
    }
  ]
}

would become equivalent to:

{
  "reflection": [
    {
      "type": "registered.class.Name",
      "allDeclaredMethods": true
    }
  ]
}

thereby removing the need for the methods field.
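
As an illustration of the rewrite described above, here is a minimal sketch that mirrors the two configurations shown. The function name collapse_methods is hypothetical, and the sketch only sets "allDeclaredMethods" as in the example; how constructors would be flagged is not specified here.

```python
import copy


def collapse_methods(metadata: dict) -> dict:
    """Replace explicit "methods" lists with "allDeclaredMethods": true."""
    result = copy.deepcopy(metadata)  # leave the caller's data untouched
    for entry in result.get("reflection", []):
        methods = entry.pop("methods", None)
        if methods:  # at least one method was registered for this type
            entry["allDeclaredMethods"] = True
    return result


example = {
    "reflection": [
        {
            "type": "registered.class.Name",
            "methods": [{"name": "methodName", "parameterTypes": []}],
        }
    ]
}
simplified = collapse_methods(example)
print(simplified)
```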

Evaluation

Testing on the metadata repository and microservice benchmarks yielded the following results:

Benchmark                          Image size increase
Reachability metadata repository   2.6% (max. 100%)
Spring Petclinic                   14.9%
Quarkus Tika                       build failed
Micronaut Shopcart                 build failed

This is a substantial image size increase in most cases, so this change is not suitable as the default setting of Native Image. Furthermore, making new code reachable can make image builds fail, which would require adjustments by the affected developers.

Enable all fields to be reflectively accessed by default

Whereas including methods for reflection can have a significant impact on reachability, including fields is typically less of an issue. We will investigate whether enabling reflective access for all fields of registered types and removing the "all[Public|Declared]Fields" fields from the reachability metadata JSON files causes a significant image size increase.

In this case, the "fields", "allDeclaredFields" and "allPublicFields" fields would be made obsolete, simplifying the following metadata:

{
  "reflection": [
    {
      "type": "registered.class.Name",
      "allPublicFields": true,
      "fields": [
        {
          "name": "fieldName"
        }
      ]
    }
  ]
}

into:

{
  "reflection": [
    {
      "type": "registered.class.Name"
    }
  ]
}
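
The simplification above amounts to dropping three keys from every reflection entry. A minimal sketch, assuming the metadata has already been parsed into a dictionary; drop_field_metadata and FIELD_KEYS are hypothetical names.

```python
import copy

# Keys that would become obsolete if all fields were accessible by default.
FIELD_KEYS = ("fields", "allDeclaredFields", "allPublicFields")


def drop_field_metadata(metadata: dict) -> dict:
    """Return a copy of the metadata without any field-related keys."""
    result = copy.deepcopy(metadata)
    for entry in result.get("reflection", []):
        for key in FIELD_KEYS:
            entry.pop(key, None)
    return result
```

Other keys of an entry, such as "methods", are left as they are.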

Evaluation

Testing on the metadata repository and microservice benchmarks yielded the following results:

Benchmark                          Image size increase
Reachability metadata repository   0.3% (max. 2.3%)
Spring Petclinic                   0.5%
Quarkus Tika                       8.9%
Micronaut Shopcart                 0.5%

The image size increase looks limited in most cases, but Quarkus shows a significant increase. This looks like a good candidate for metadata simplification, but an opt-out should be maintained.

Consider all reflectively accessed types as unsafe allocated, accessed through JNI, and/or serializable

We will evaluate whether removing the "unsafeAllocated" JSON field, or the "jni" or "serialization" objects, from reachability metadata significantly affects image size. Along with the resource changes proposed above and the potential integration of resource bundles into the reflection metadata, this could reduce reachability metadata to a single set of types, thus greatly simplifying it.

A metadata file as complex as this one:

{
  "reflection": [
    {
      "type": "registered.class.Name",
      "unsafeAllocated": true
    }
  ],
  "jni": [
    {
      "type": "registered.class.Name"
    }
  ],
  "serialization": [
    {
      "type": "registered.class.Name"
    }
  ]
}

could become as simple as:

{
  "reflection": [
    {
      "type": "registered.class.Name"
    }
  ]
}
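
The collapse shown above can be sketched as a merge of the three sections into one set of types. This is an illustrative sketch under assumptions, not the builder's behavior: merge_into_reflection and type_name are hypothetical names, and any extra keys on "jni" or "serialization" entries are simply discarded.

```python
import copy


def type_name(entry: dict) -> str:
    # Accept either "type" or "name" as the key holding the class name.
    return entry.get("type") or entry.get("name")


def merge_into_reflection(metadata: dict) -> dict:
    """Fold "jni" and "serialization" types into "reflection" and drop "unsafeAllocated"."""
    # Sections other than the three merged ones (e.g. "resources") pass through unchanged.
    result = {
        section: copy.deepcopy(entries)
        for section, entries in metadata.items()
        if section not in ("reflection", "jni", "serialization")
    }
    by_type = {}  # class name -> reflection entry, in first-seen order
    for entry in metadata.get("reflection", []):
        kept = {k: copy.deepcopy(v) for k, v in entry.items() if k != "unsafeAllocated"}
        by_type[type_name(entry)] = kept
    for section in ("jni", "serialization"):
        for entry in metadata.get(section, []):
            name = type_name(entry)
            # Types only present in "jni" or "serialization" get a bare entry.
            by_type.setdefault(name, {"type": name})
    result["reflection"] = list(by_type.values())
    return result
```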

Evaluation

Testing on the metadata repository and microservice benchmarks yielded the following results:

All types registered as unsafe allocated

Benchmark                          Image size increase
Reachability metadata repository   0.4% (max. 3.2%)
Spring Petclinic                   1.7%
Quarkus Tika                       0.1%
Micronaut Shopcart                 0.0%

The image size increase looks reasonable, so this looks like a good candidate for metadata simplification.

All types registered for JNI

Benchmark                          Image size increase
Reachability metadata repository   0.7% (max. 5.6%)
Spring Petclinic                   7.1%
Quarkus Tika                       3.6%
Micronaut Shopcart                 0.0%

The image size increase is substantial. Reflectively accessed types should probably not all be registered for JNI by default. However, adding a "jniAccessible" field to the reflection metadata could be a good way of reducing the metadata size without impacting image size too much.

All types registered for serialization

Benchmark                          Image size increase
Reachability metadata repository   1.1% (max. 13.5%)
Spring Petclinic                   4.9%
Quarkus Tika                       build failed
Micronaut Shopcart                 0.7%

As with JNI, registering all reflectively-accessed types for serialization causes a large increase in image size. There is also the added risk of making code reachable that breaks the image build. A "serializable" field in the reflection configuration would be an interesting addition as well.
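
The "jniAccessible" and "serializable" fields suggested above are proposals, not an existing schema as far as this evaluation states. Assuming they were adopted as plain booleans on reflection entries, existing metadata could be folded roughly as follows. The function name fold_with_flags is hypothetical, and the sketch discards any extra keys on "jni" and "serialization" entries.

```python
import copy


def fold_with_flags(metadata: dict) -> dict:
    """Move "jni" and "serialization" types onto reflection entries as boolean flags."""
    result = {
        section: copy.deepcopy(entries)
        for section, entries in metadata.items()
        if section not in ("jni", "serialization")
    }
    reflection = result.setdefault("reflection", [])
    index = {entry["type"]: entry for entry in reflection}
    flags = (("jni", "jniAccessible"), ("serialization", "serializable"))
    for section, flag in flags:
        for entry in metadata.get(section, []):
            name = entry["type"]
            target = index.get(name)
            if target is None:
                # Type was not registered for reflection yet: add a bare entry.
                target = {"type": name}
                reflection.append(target)
                index[name] = target
            target[flag] = True
    return result
```

Compared with registering every type for JNI and serialization, this keeps the opt-in per type while still removing two top-level sections.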

Always register all super constructors of serializable types

Serialization metadata currently requires a custom super constructor class to be specified if a non-default constructor is expected to be used. We will evaluate the impact of including all super constructors of serializable classes, thus removing this requirement. This would get rid of the "customTargetConstructorClass" field in the serialization metadata.

Evaluation

Benchmark                          Image size increase
Reachability metadata repository   0.2% (max. 0.4%)
Spring Petclinic                   0.0%
Quarkus Tika                       0.0%
Micronaut Shopcart                 0.0%

The image size increase caused by this change is minimal. This field should be removed and all constructors registered by default.

Combined testing

We also evaluated the combined effect of the three changes causing minimal image size impact: registering all fields of reflectively-accessible types as accessed, registering those types as unsafe allocated by default, and registering all super constructors of serializable classes as potential custom serialization constructors. The results are as follows:

Benchmark                          Image size increase
Reachability metadata repository   0.5% (max. 4.6%)
Spring Petclinic                   2.2%
Quarkus Tika                       8.9%
Micronaut Shopcart                 0.5%

All in all, those results are reasonable and those three changes can be implemented in Native Image, provided an opt-out is kept for automatic field inclusion, as explained in the corresponding section above.
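
To show what the three retained changes would mean for an existing metadata file, here is a minimal sketch. The function name apply_retained_changes and the keep_fields parameter are hypothetical; keep_fields only models the opt-out for automatic field inclusion, whose actual form is not specified here.

```python
import copy

FIELD_KEYS = ("fields", "allDeclaredFields", "allPublicFields")


def apply_retained_changes(metadata: dict, keep_fields: bool = False) -> dict:
    """Strip the keys made redundant by the three low-impact changes."""
    result = copy.deepcopy(metadata)
    for entry in result.get("reflection", []):
        # All reflectively accessed types would be unsafe allocated by default.
        entry.pop("unsafeAllocated", None)
        if not keep_fields:
            # All fields of registered types would be accessible by default.
            for key in FIELD_KEYS:
                entry.pop(key, None)
    for entry in result.get("serialization", []):
        # All super constructors would be registered by default.
        entry.pop("customTargetConstructorClass", None)
    return result
```

Calling it with keep_fields=True leaves the field keys in place for projects that want to keep explicit field registration.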

[Figure: Breakdown of the metadata repository results]

@loicottet loicottet self-assigned this Sep 13, 2024
@vjovanov vjovanov changed the title Evaluate the impact of reducing the size of Native Image reachability metadata Evaluate the impact of reducing the complexity of Native Image reachability metadata Sep 13, 2024
@vjovanov vjovanov removed this from Native Image Sep 13, 2024
@vjovanov vjovanov changed the title Evaluate the impact of reducing the complexity of Native Image reachability metadata Reducing the complexity of Native Image reachability metadata Sep 13, 2024
@loicottet loicottet moved this to In Progress in GraalVM Community Roadmap Oct 1, 2024
@sdeleuze
Collaborator

sdeleuze commented Oct 1, 2024

My feedback from Spring POV:

  • Include all resources from program JARs: we tried that and it does not work well in practice without a big list of exceptions, because some libraries ship with megabytes of resources that are not always useful. I have a related question: do resources included in the native image only impact the image size, or also the memory footprint when they are not loaded?
  • Include all methods if one method is included: the mental model is not obvious and could be confusing, and I think the transitive side effects will be too bad in terms of footprint. I think I am strongly against such a change.
  • Enable all fields to be reflectively accessed by default: good idea, low impact, please do it!
  • Consider all reflectively accessed types as unsafe allocated, accessed through JNI, and/or serializable: maybe, if the cost is not too high. For JNI, maybe you should consider how the Foreign Function Interface (FFI) works with native code and target that as the medium/long-term native interop mechanism.
  • Always register all super constructors of serializable types: good idea, low impact, please do it!
  • Spring is using https://github.com/spring-projects/spring-framework/blob/main/spring-core/src/main/java/org/springframework/aot/nativex/feature/PreComputeFieldFeature.java for most classpath checks, could we get such kind of feature officially supported via an upcoming version of metadata?

Also, we care more about memory consumption than image size, so please measure that on typical apps (empty WebMVC app + empty WebFlux app + Petclinic).

@zakkak
Collaborator

zakkak commented Oct 2, 2024

Some comments from the Quarkus POV:

  • Include all resources from program JARs:
    • Register files from the user file system unrelated to the program: Yes please. This should have zero impact on size and performance while removing the need to register all potential filesystem paths as resources to avoid getting MissingRegistrationErrors.
    • Resources from program JARs: I agree with Sebastien, this seems too aggressive and can have a huge impact depending on the JAR file, e.g. JARs shipping with native libraries for multiple platforms.
  • Include all methods if one method is included: I agree with Sebastien that the transitive effects of this can be bad. As far as I understand, registered methods are treated as roots (i.e. always reachable), which means that anything reachable from them also becomes reachable. This is often undesirable both in terms of image size and in terms of bringing in optional dependencies.
  • Enable all fields to be reflectively accessed by default: The maximum size impact seems high. I am also concerned about requiring some optional dependency to be brought in due to the field's type.
  • Consider all reflectively accessed types as unsafe allocated, accessed through JNI, and/or serializable: The maximum size impact (3%) on treating all types as unsafe allocated seems quite high.
  • Always register all super constructors of serializable types: Makes sense.

In general all the suggestions make sense as defaults since they lower the entry barrier, even if some of them increase the image size, so I believe it would be good to implement them while keeping the current syntax support so that users/libraries/frameworks can still opt-out of these new defaults and perform further optimizations.

@vjovanov vjovanov changed the title Reducing the complexity of Native Image reachability metadata [GR-58477] Reducing the complexity of Native Image reachability metadata Oct 3, 2024
@vjovanov vjovanov changed the title [GR-58477] Reducing the complexity of Native Image reachability metadata [GR-58477] Reduce the complexity of Native Image reachability metadata Oct 3, 2024
@loicottet
Member Author

#10178 is the first PR to implement those changes.
