
The calling convention is a serious wtf. They're relying on store-load forwarding to make the stack as free as a register, but that's iffy at best and changes heavily between microarchitectures.


I'd assert the calling convention is strange by design. The underlying reality is that, to support actual closures and lambdas as Go does (in the Lisp sense, not the fake Java sense), one can't use the C calling conventions. In particular, a called function can't expect to find bindings for all of its variables on a call stack, because of the upward funargs issue: in the presence of true lambdas, and thus closures, some bound variables of a called function will necessarily NOT be found on the C call stack, because lambdas (anonymous functions) dissociate scope from liveness.


What you describe is a non-problem: you can trivially spill upvars to the stack on-demand, as most compilers do, while keeping formal parameters in registers. Java needs upvars to be final because it doesn't have the concept of "reference to local variable", but that's just a limitation of the JVM, and one easily solved in other runtimes that very much can pass arguments in registers (e.g. .NET).


The Go developers have considered changing to a register-based calling convention[0][1].

I found these tickets a few weeks ago and they explained why the Go developers haven't yet made this change.

[0] https://github.com/golang/go/issues/18597

[1] https://github.com/golang/go/issues/27539


Interestingly, one of the suggestions to deal with issues in panic backtraces due to this change is to use DWARF.


I'm not familiar with the issue: what makes Java's lambdas/closures fake? Is it that bound variables need to be effectively final?


I don’t know if they’ve done anything new, but as originally implemented, they were inner classes.


The inner class gets copies of the variables, so imperative code that wants to reassign them isn't allowed because it probably won't do what you expected.

The goal is not to GC stack frames. But I'm not sure why they didn't create an inner class to hold the closed-over variables in non-final fields (moving them from the stack to the heap) for both the function and all closures it creates.

(Obligatory "doctor, it hurts when I use mutable state!")


Ah, gotcha. Honestly, I always use this as an example of one of the subtle design points that I really appreciate Java for.

Nitpick, but saying copies in Java can get confusing. Both primitives and references are bound by value. I'm sure you know, but for others: no objects are copied.

I always found this limitation had reassuring regularity; it's the same way arguments are bound to function parameters (minus being final). Local variables being isolated from "other scopes" means that any interthread communication must be mediated through objects.


They were never implemented that way; rather, they make use of the invokedynamic bytecode.

https://youtu.be/Uns1dm3Laq4

Android Java is the one making use of anonymous inner classes instead.


I believe they still are, with the caveat that for lambdas the bytecode is built at runtime, not at compile time like regular inner classes.


Invokedynamic is not related at all to inner classes.


Maybe my memory is a little rusty or I glossed over a bit too much, but I was thinking of how HotSpot does lambdas from here[0]. It seems to use the invokedynamic bootstrap method to spin an inner class at runtime. To be fair, it's a HotSpot thing and not in the JVM spec.

[0]: https://github.com/frohoff/jdk8u-jdk/blob/master/src/share/c...


Better check out Brian's talk.

Not really, because the class file with invokedynamic bytecodes is supposed to work across all JVM implementations.


I think we agree? The bytecode is transferable because the class file only contains an invokedynamic that calls the LambdaMetafactory for bootstrapping. The LambdaMetafactory is provided by the runtime JVM itself, so that linkage doesn't introduce an implementation dependence.

HotSpot's implementation just happens to spin an inner class at runtime.


Yes, we agree. I do concede that I wasn't fully correct.


> Is it that bound variables need to be effectively final?

I believe this is it.


Even with store-load fw, you get a penalty (~3 cycle latency) over register accesses, no?


Yeah, but it's cheaper than a full L1 hit, which is where it would go if not for that.


I was trying to cite a typical full L1 hit latency... I thought store-load fw simply avoids having to flush the complete write buffer before the access is even possible, which risks taking far more than ~3 cycles. Now maybe it can be faster than an L1 hit in some cases, I don't know.

Edit: it seems that store-load forwarding is actually slightly slower than L1: https://www.agner.org/optimize/blog/read.php?i=854#854


I'm guessing that the reason was simply ease of porting 32-bit x86 assembly code to 64-bit.



