Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Native (Kotlin): improve performance #2

Merged
merged 5 commits into from
Apr 30, 2020
Merged

Native (Kotlin): improve performance #2

merged 5 commits into from
Apr 30, 2020

Conversation

dkhalanskyjb
Copy link
Collaborator

Fixes include:

  • The system timezone is only queried once, after that, it's cached
    for the whole life of the thread; Java does something similar,
    as, according to
    https://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html#setDefault(java.util.TimeZone),
    the system timezone is only queried at the start of the VM, if
    ever.
  • Most common ZoneOffset instances are now cached, which saves
    a lot of time due to not having to recompute their string
    representation, which is costly in aggregate due to how common
    getting anonymous offsets is for operations on datetimes.
  • Strings are not converted between UTF-16 and UTF-8 when querying
    offsets for a particular timezone (the noStringConversion
    cinterop parameter handles this).
  • Comparators have been rewritten not to use compareBy, which is
    comparatively slow.
  • LocalDate no longer uses Month as its main representation of
    months, as it is an enum, which are surprisingly slow.
  • LocalDate used to calculate many additional fields which may
    never be required but were nonetheless computed for each instance
    of the class. Now, instantiating LocalDate is much cheaper at
    the cost of several additional fields being computed on each call.

After these fixes, on my machine tests run in 25 seconds instead of
56.

@dkhalanskyjb dkhalanskyjb force-pushed the native branch 2 times, most recently from ef0a902 to 54a8d39 Compare April 23, 2020 07:32
require(secondOfDay in 0..SECONDS_PER_DAY)
require(nanoOfSecond in 0..1_000_000_000)
require(secondOfDay >= 0 && secondOfDay <= SECONDS_PER_DAY)
require(nanoOfSecond >= 0 && nanoOfSecond <= 1_000_000_000)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs an issue reference about the missing optimization in Native.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put a reference to it here, that would explain why the non-idiomatic code is used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

Fixes include:
* The system timezone is only queried once, after that, it's cached
  for the whole life of the thread; Java does something similar,
  as, according to
  https://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html#setDefault(java.util.TimeZone),
  the system timezone is only queried at the start of the VM, if
  ever.
* Most common `ZoneOffset` instances are now cached, which saves
  a lot of time due to not having to recompute their string
  representation, which is costly in aggregate due to how common
  getting anonymous offsets is for operations on datetimes.
* Strings are not converted between UTF-16 and UTF-8 when querying
  offsets for a particular timezone (the `noStringConversion`
  cinterop parameter handles this).
* Comparators have been rewritten not to use `compareBy`, which is
  comparatively slow.
* `LocalDate` no longer uses `Month` as its main representation of
  months, as it is an enum, which are surprisingly slow.
* `LocalDate` used to calculate many additional fields which may
  never be required but were nonetheless computed for each instance
  of the class. Now, instantiating `LocalDate` is much cheaper at
  the cost of several additional fields being computed on each call.

After these fixes, on my machine tests run in 25 seconds instead of
56.
The tactics employed in this commit are questionable.

The main change is that now, instead of passing strings between
Kotlin and native parts, numeric ids are used. To have a
sensible interpretation of such ids that prohibited their
invalidation, Apple and Windows use additional memory to store the
mapping. Moreover, it is assumed for Apple that the set of
timezones and their rules never change, and for *nix, it is
assumed that the timezone objects are always at the same memory
location, which is true for now but will change if the
implementation switches to use C++20 capabilities.
@dkhalanskyjb dkhalanskyjb force-pushed the native branch 2 times, most recently from 4d8159d to 869bba9 Compare April 28, 2020 08:40
@dkhalanskyjb dkhalanskyjb requested a review from ilya-g April 28, 2020 08:49
Testing revealed that accessing even immutable fields concurrently
is not supported out-of-the-box and instead crashes the Kotlin
Native runtime.
@dkhalanskyjb dkhalanskyjb merged commit ae922df into master Apr 30, 2020
@dkhalanskyjb dkhalanskyjb deleted the native branch April 30, 2020 08:01
dkhalanskyjb added a commit that referenced this pull request Apr 30, 2020
Testing revealed that accessing even immutable global variables
concurrently is not supported out-of-the-box and instead crashes
the Kotlin Native runtime. Thus, `SharedImmutable` annotation was
added to every global variable.

Performance fixes include:
* The system timezone is only queried once, after that, it's cached
  for the whole life of the thread; Java does something similar,
  as, according to
  https://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html#setDefault(java.util.TimeZone),
  the system timezone is only queried at the start of the VM, if
  ever.
* Instead of passing strings between Kotlin and native parts,
  numeric ids are used. To have a sensible interpretation of such
  ids that prohibited their invalidation, Apple and Windows use
  additional memory to store the mapping. Moreover, it is assumed
  for Apple that the set of timezones and their rules never change,
  and for *nix, it is assumed that the timezone objects are always
  at the same memory location, which is true for now but will
  change if the implementation switches to use C++20 capabilities.
* Most common `ZoneOffset` instances are now cached, which saves
  a lot of time due to not having to recompute their string
  representation, which is costly in aggregate due to how common
  getting anonymous offsets is for operations on datetimes.
* Comparators have been rewritten not to use `compareBy`, which is
  comparatively slow.
* `LocalDate` no longer uses `Month` as its main representation of
  months, as it is an enum, which are surprisingly slow.
* `LocalDate` used to calculate many additional fields which may
  never be required but were nonetheless computed for each instance
  of the class. Now, instantiating `LocalDate` is much cheaper at
  the cost of several additional fields being computed on each call.
* Some Kotlin-style range checks were replaced with the
  corresponding simple comparisons due to a missing optimization
  in Kotlin/Native. See
  #5

After these fixes, on my machine tests run in 22 seconds instead of
56.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants