Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thunk.asFunction0 is broken in Scala 3 nightly #2816

Closed
armanbilge opened this issue Feb 9, 2022 · 24 comments · Fixed by scala/scala3#14648
Closed

Thunk.asFunction0 is broken in Scala 3 nightly #2816

armanbilge opened this issue Feb 9, 2022 · 24 comments · Fixed by scala/scala3#14648
Labels

Comments

@armanbilge
Copy link
Member

This was identified in scala/scala3#14395 (comment), apparently due to scala/scala3#14295.

@armanbilge
Copy link
Member Author

An optimization is applied: If the argument e to a cbn parameter is already
of type () ?=> T and is a pure expression, we avoid (2) and (3), i.e. we
pass e directly instead of () ?=> e.apply().

This reads to me a lot like the optimization that Scala 2 has and that Scala 3 was previously lacking as identified in #2225. But I don't understand why our trick broke.

@djspiewak
Copy link
Member

Yeah I'm perplexed as well. I think this merits some more experimentation. We should try to figure out what's going on so we can give them feedback before they release. At a minimum, the fact that this is detectably different between versions of Scala 3 is frustrating, since it means that our users on one or the other version will have a significantly degraded experience.

@armanbilge
Copy link
Member Author

benchmarks/Jmh/run -wi 10 -i 10 -f 2 -t 1 DeepBindBenchmark.delay --jvmArgs -Dcats.effect.tracing.mode=none

3.1.3-RC1-bin-20220207-74c2954-NIGHTLY

[info] Benchmark                (size)   Mode  Cnt     Score     Error  Units
[info] DeepBindBenchmark.delay   10000  thrpt   20  4058.603 ± 105.582  ops/s

3.1.1

[info] Benchmark                (size)   Mode  Cnt     Score     Error  Units
[info] DeepBindBenchmark.delay   10000  thrpt   20  3712.925 ± 332.421  ops/s

Not sure if we can really read anything into it.

@armanbilge
Copy link
Member Author

At a minimum, the fact that this is detectably different between versions of Scala 3 is frustrating, since it means that our users on one or the other version will have a significantly degraded experience.

Yep, this is what got my attention too. Is there no way to check the Scala 3 version at runtime and use that to choose an implementation?

@armanbilge
Copy link
Member Author

I tried restoring the original Java-based workaround from #2226, but the test still fails.

class ThunkImpl extends Thunk {
@Override
public <A> scala.Function0<A> asFunction0(scala.Function0<A> thunk) {
return thunk;
}
}

So it must be that the thunk is being wrapped earlier than it used to be?

@djspiewak
Copy link
Member

I think we may have to do some bytecode dumpster diving to figure this one out.

@armanbilge
Copy link
Member Author

armanbilge commented Feb 9, 2022

Decompiled Java is about as much as I'll tolerate 😤

"Thunk.asFunction0" should {
  "return the same function" in {
    var i = 0
    val f = () => i += 1
    Thunk.asFunction0(f()) eq f
  }
}

3.0.2

 private static final boolean $init$$$anonfun$2$$anonfun$1() {
    IntRef i = IntRef.create(0);
    Function0 f = () -> {
       int var1 = i.elem + 1;
       i.elem = var1;
    };
    return cats.effect.Thunk..MODULE$.asFunction0(f) == f;
 }

3.1.3-RC1-bin-20220207-74c2954-NIGHTLY

 private static final void $init$$$anonfun$1$$anonfun$1$$anonfun$1(final Function0 f$1) {
    f$1.apply$mcV$sp();
 }

 private static final boolean $init$$$anonfun$1$$anonfun$1() {
    IntRef i = IntRef.create(0);
    Function0 f = () -> {
       int var1 = i.elem + 1;
       i.elem = var1;
    };
    return cats.effect.Thunk..MODULE$.asFunction0(() -> {
       $init$$$anonfun$1$$anonfun$1$$anonfun$1(f);
       return BoxedUnit.UNIT;
    }) == f;
 }

@djspiewak
Copy link
Member

Uh. That looks like a regression honestly. Unless im misreading the above, it would seem that the change is causing unnecessary wrapping of the thunk.

@vasilmkd
Copy link
Member

vasilmkd commented Feb 9, 2022

This should be enough to file an issue upstream, no?

@djspiewak
Copy link
Member

Let's dig into the raw bytecode first, but it is very suggestive of a bug upstream. It honestly looks like the compiler is doing the opposite of what Martin's change was intended to achieve.

@armanbilge
Copy link
Member Author

armanbilge commented Feb 9, 2022

Putting aside the performance regression (which FTR my benchmarks were unable to pick up), I realized that actually this change doesn't break tracing. The relevant test for that continues to pass.

"generate unique traces" in {
val a = IO(println("foo"))
val b = IO(println("bar"))
(a, b) match {
case (IO.Delay(_, eventA), IO.Delay(_, eventB)) => eventA ne eventB
case _ => false
}
}

Even more interesting, actually you could argue the accuracy of cached tracing is improved. This test started to fail, but actually it encodes a less-than-desirable behavior, which is that two independent IOs that happen to call the same method end up sharing the same trace due to caching.

"generate identical traces" in {
val f = () => println("foo")
val a = IO(f())
val b = IO(f())
(a, b) match {
case (IO.Delay(_, eventA), IO.Delay(_, eventB)) => eventA eq eventB
case _ => false
}

Because each of those thunks now generates a unique wrapping lambda, they get unique traces.

@armanbilge
Copy link
Member Author

armanbilge commented Feb 10, 2022

One more observation: what has changed in the compiler is actually the original 2.13->3 regression that caused us to introduce Thunk.asFunction0 in the first place.

If I restore the code to how it used to be:

   def delay[A](thunk: => A): IO[A] = {
-    val fn = Thunk.asFunction0(thunk)
+    val fn = () => thunk
     Delay(fn, Tracing.calculateTracingEvent(fn))
   }

These are the test results:

3.1.3-RC1-bin-20220207-74c2954-NIGHTLY

[info] TracingSpec
[info] IO.delay should
[error]   x generate identical traces
[error]    false (file:1)
[info]   + generate unique traces
[info] Async.delay should
[error]   x generate identical traces
[error]    false (file:1)
[info]   + generate unique traces
[info] Total for specification TracingSpec

3.0.1

[info] TracingSpec
[info] IO.delay should
[info]   + generate identical traces
[error]   x generate unique traces
[error]    false (file:1)
[info] Async.delay should
[info]   + generate identical traces
[error]   x generate unique traces
[error]    false (file:1)
[info] Total for specification TracingSpec

Essentially, there's been a sort of flip-flop.

In the code:

   def delay[A](thunk: => A): IO[A] = {
     val fn = () => thunk
     Delay(fn, Tracing.calculateTracingEvent(fn))
   }

In 3.0.2, the () => thunk was creating a wrapper. Because this wrapper always had the same class, it was causing all thunks to share the same cache key.

In the nightly, the () => thunk is optimized to the Function0 backing the thunk, just like it works in Scala 2.

However, it's the opposite story at the callsite:

var i = 0
val f = () => i += 1
IO.delay(f())

In 3.0.2, the thunk passed to IO.delay was f. This is also how it works in Scala 2.

In the nightly, the f is being wrapped in another Function0 and this is passed in as the thunk. Since the wrapping occurs at the callsite, it is now out of our control. However, it does have the benefit of making every use of IO.delay unique from the perspective of trace caching.

@djspiewak
Copy link
Member

djspiewak commented Feb 10, 2022

In the nightly, the f is being wrapped in another Function0 and this is passed in as the thunk

This is the bit that I think constitutes a regression. Definitely the improvement of the "thunk to () => A" behavior is very very welcome.

Realistically, we're probably going to have to upgrade to the latest Scala 3 in order to force call sites into this particular pattern, otherwise tracing breaks.

@armanbilge
Copy link
Member Author

Realistically, we're probably going to have to upgrade to the latest Scala 3 in order to force call sites into this particular pattern, otherwise tracing break

Why would tracing break?

@djspiewak
Copy link
Member

Why would tracing break?

Doesn't the asFunction0 trick cause the tracing to be misassociated and/or uncached?

@armanbilge
Copy link
Member Author

Hmm. I wouldn't say "misassociated". It does break somewhat the caching, but that actually improves the accuracy as I noted above.

@armanbilge
Copy link
Member Author

FWIW neither @vasilmkd nor I have been able to pick this regression up in benchmarks yet.

@djspiewak
Copy link
Member

FWIW neither @vasilmkd nor I have been able to pick this regression up in benchmarks yet.

I'm pretty sure this is because the JVM inlining is quite good, particularly in those cases. It's possible that it's good in all regressed cases but I'm not entirely convinced that it would be, and at the very least, it seems implausible that more wrapping is better. :-)

Hmm. I wouldn't say "misassociated". It does break somewhat the caching, but that actually improves the accuracy as I noted above.

Oh so the trick is unnecessary on the nightlies, but it doesn't in and of itself break anything? I was concerned that we needed asFunction0 on earlier Scala 3 and needed to not use asFunction0 post-changes.

@armanbilge
Copy link
Member Author

it seems implausible that more wrapping is better.

It's better for accurate cached tracing :P

Oh so the trick is unnecessary on the nightlies, but it doesn't in and of itself break anything? I was concerned that we needed asFunction0 on earlier Scala 3 and needed to not use asFunction0 post-changes.

The trick has been unnecessary in Scala 2 all along and we've been using it anyway. The only reason it would break is if one day thunk is no longer a Function0 behind-the-scenes.

@djspiewak
Copy link
Member

It's better for accurate cached tracing :P

LOL yes, but I'm more worried about general performance.

@djspiewak
Copy link
Member

@armanbilge Do we need to make any adjustments on our end for this? Or is this just suppressing the test?

@armanbilge
Copy link
Member Author

Yes, I think it's just a matter of suppressing tests. We were testing more than we need to establish that tracing is working correctly, and fortunately there is no regression there.

Arguably, the ThunkSpec was actually testing that lambdas passed to delay are not de-optimized, and rightfully failed. This is unrelated to the correctness of tracing but a meaningful regression nonetheless.

@djspiewak
Copy link
Member

@armanbilge I'm going to leave this open to track any adjustments to the unit tests.

@armanbilge
Copy link
Member Author

I just tried 3.2.0-RC1-bin-20220307-6dc591a-NIGHTLY which includes scala/scala3#14628, and both ThunkSpec and TracingSpec are passing without modifications.

Let's get these tests re-enabled in the Scala 3 community build and we can close here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants