Hacker News

That still doesn't make sense to me. How can JIT in general be slower than Java when Java is JITed?


The peer comment about static typing is correct, but there's more, err, flavor to be enjoyed in the challenges of JITing JS. Here are some deopt scenarios to keep you up at night.

To whet your appetite, what happens if you redefine Math.round? Java prohibits this for obvious reasons, but a JS joker may write:

    Math.round = () => Infinity;
JS engines really do inline Math.round, and must be prepared for this nonsense.
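
A minimal sketch of why that inlining needs a guard. The helper name roundAll and the warm-up count are my own assumptions, and whether a given engine actually optimizes the loop is up to the engine. The only thing the language guarantees is that the call after the redefinition sees the new function.

```javascript
// Hypothetical hot function: a natural candidate for inlining Math.round.
function roundAll(values) {
  const out = [];
  for (const v of values) out.push(Math.round(v));
  return out;
}

const originalRound = Math.round;
const samples = [0.4, 1.5, 2.6];

// Warm up so an optimizing tier has a chance to specialize roundAll.
for (let i = 0; i < 10000; i++) roundAll(samples);
const before = roundAll(samples);

// The joker strikes. Any specialized code must now be discarded or guarded.
Math.round = () => Infinity;
const after = roundAll(samples);

// Restore the builtin so the rest of the program is unaffected.
Math.round = originalRound;
```

Whatever the engine compiled during the warm-up, the second call still has to return Infinity for every element.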

It gets worse. Maybe you check for a property which is usually not found:

    if (!val.uuid)
       val.uuid = uuidgen();
Hidden classes can almost reduce this to a pointer comparison. But what if someone adds a uuid property on Object.prototype? Every check is busted! V8 handles this with "validity cells", and it's ugly, requiring that every object know about every object which "inherits" from it.
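
Here is a small sketch of that prototype scenario. The uuidgen below is a made-up stand-in (a counter, not a real UUID generator) and tag is a hypothetical wrapper around the check above. It only shows the language-level behavior an engine must preserve, not how V8 implements it.

```javascript
// Hypothetical stand-in for uuidgen(): a counter, not a real UUID.
let counter = 0;
const uuidgen = () => `id-${++counter}`;

function tag(val) {
  if (!val.uuid)
    val.uuid = uuidgen();
  return val;
}

const hasOwn = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Normal case: the lookup misses, so the object gets its own uuid.
const a = tag({ name: "a" });
const aHasOwn = hasOwn(a, "uuid");

// Someone adds uuid to Object.prototype: the same lookup now hits.
Object.prototype.uuid = "gotcha";
const b = tag({ name: "b" });
const bHasOwn = hasOwn(b, "uuid");
const bUuid = b.uuid;

// Clean up the global prototype again.
delete Object.prototype.uuid;
```

The object b has the same own properties as a had before tagging, yet the "property not found" answer cached for that shape is no longer valid once the prototype changes.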

Now if you are a monster you may choose to write:

    Object.defineProperty(Array.prototype, "42", {value: "lol"});
    console.log([][42]); // yes it prints "lol"
Every array gets a default value for 42. Think about how you would JIT numeric code in such an environment...


Python suffers the same issue. There's some discussion about dealing with it in this PEP and the pages it links: https://www.python.org/dev/peps/pep-0509/


Python does not seem to be even trying to be competitive in performance though so it's less of an issue for them.


They do, it's just not a priority. The PEP I linked (as well as most of the work of its author, Victor Stinner, in recent years) is motivated by JITing.

There's also the "frame evaluation API" PEP [1], whose purpose is to allow pluggable evaluators in CPython without forking the entire interpreter, like Unladen Swallow had to.

[1]: https://www.python.org/dev/peps/pep-0523/


Python is orders of magnitude slower than JS, though. It has the same problems, but doesn't even try solving most. PyPy would be a better VM to compare against.


Java deals with almost exactly the same issue. If you define an interface and a single implementing class, the code will be compiled to always call that class's method, and then deoptimized and recompiled if you load another class that implements that interface. JS could deal with these issues in a similar manner.


You can redefine Math.round. It's an ordinary method from rt.jar. You can change rt.jar or you can redefine it with java agent.

Not as straightforward as with JS, but the Java JIT still has to account for those things.

Maybe it will assume some things about standard classes for optimization, but for user classes that will be true anyway.


You can really redefine an existing Java method at runtime? Well TIL!


The Java JIT has static type information to work with, the JS JIT can only infer type information via heroic efforts. Static types do mean something, especially when working with primitives and other unboxed data types.


This. JavaScript engines have to do stuff like https://mathiasbynens.be/notes/shapes-ics to assume many objects will have the same key names and types and create an optimized class layout, with some kind of expensive fallback if anyone ever randomly stores a key or value that violates the assumption. They can't even be sure you'll always be doing integer math (unless you resort to bit masking a la asm.js). Hinting (like __slots__ in Python) might have helped some.
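
A rough sketch of both points. Shapes are not observable from JS, so key order is used here only as a visible proxy for what the linked article describes. All function names are hypothetical.

```javascript
// Same keys, same order, every time: objects like these can share one layout.
function makePoint(x, y) {
  return { x, y };
}

// Keys added in a data-dependent order: same content, but the objects
// are built differently, which the linked article describes as distinct shapes.
function makeMessyPoint(x, y, flip) {
  const p = {};
  if (flip) { p.y = y; p.x = x; } else { p.x = x; p.y = y; }
  return p;
}

// One call site that sees both kinds of object still computes the same thing,
// it just cannot assume a single layout.
function lengthSquared(p) {
  return p.x * p.x + p.y * p.y;
}

// asm.js-style bit masking: "| 0" forces the result into the int32 range.
function addInt32(a, b) {
  return (a + b) | 0;
}

const tidy = lengthSquared(makePoint(3, 4));
const messy = lengthSquared(makeMessyPoint(3, 4, true));
const messyKeys = Object.keys(makeMessyPoint(3, 4, true)).join(",");
const wrapped = addInt32(2147483647, 1); // wraps around like a 32-bit integer
```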


This was one of the main motivations for writing Dart. The V8 implementers were tired of having to write these kinds of hacks. Even though Dart was initially a dynamically typed language, the shape of the objects was stable.


> The Java JIT has static type information to work with

I don't know that the Java JIT has type feedback. Regardless, Java being statically typed means Java code is structurally closer to what a JIT wants and can analyse. It's much more difficult (and thus rarer) to fuck around with types and objects generated on the fly at runtime. For instance, you're not going to add new attributes to an instance, whereas that's just Tuesday in JavaScript.
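
For anyone who hasn't seen it, this is the kind of Tuesday I mean. The User class and its fields are made up for the example.

```javascript
class User {
  constructor(name) { this.name = name; }
  greet() { return `hi, ${this.name}`; }
}

const u = new User("ann");
const greetingBefore = u.greet();

// Reshape a single instance at runtime.
u.nickname = "annie";                                     // new field
u.greet = function () { return `yo, ${this.nickname}`; }; // per-instance method
delete u.name;                                            // field removed

const greetingAfter = u.greet();

// Other instances of the same class are untouched.
const other = new User("bob");
const otherGreeting = other.greet();
```

After this, u and other come from the same class but no longer have the same fields or even the same greet, which is exactly what a Java JIT never has to consider.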


Java JITs definitely do have type feedback. They profile the types of objects at virtual call sites and do guarded inlining, as well as removing expensive interface casts, etc. The JVM doesn't really have a distinction between classes generated on the fly and those that were on disk, as they all go through a classloader anyway. JVMs, for the most part, blow their brains out at the end of the day and start with zero the next VM run.
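
A toy model of guarded inlining at one call site, written in JS purely to show the control flow. A real JVM does this in generated machine code against the object's class pointer, so every name here (makeAreaCallSite, the stats counters, the shape classes) is hypothetical.

```javascript
class Circle {
  constructor(r) { this.r = r; }
  area() { return Math.PI * this.r * this.r; }
}

class Square {
  constructor(s) { this.s = s; }
  area() { return this.s * this.s; }
}

// Models the call site `shape.area()` with a single-class guard.
function makeAreaCallSite() {
  let seenClass = null;    // type feedback: the class observed last
  let cachedMethod = null; // the target we speculated on
  const stats = { fast: 0, slow: 0 };

  function call(shape) {
    if (seenClass !== null && shape.constructor === seenClass) {
      stats.fast++;                     // guard passed: use the cached target
      return cachedMethod.call(shape);
    }
    stats.slow++;                       // guard failed: "deoptimize", look up
    seenClass = shape.constructor;      // the method normally and re-profile
    cachedMethod = shape.area;
    return cachedMethod.call(shape);
  }

  return { call, stats };
}

const site = makeAreaCallSite();
const shapes = [new Circle(1), new Circle(2), new Square(3), new Square(2)];
const results = shapes.map((s) => site.call(s));
```

The first Circle and the first Square take the slow path; the repeats hit the guard and reuse the cached target.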


Well, you're right that JS JITs can only infer type information via heroic efforts, but AFAIK the Java compiler throws away any type information from the source code, which means that the JVM JIT needs to infer type information again from the bytecode.

Still, the JVM JIT is faster than JS due to reasons explained in the sibling comment of the parent one.


Java is only throwing generic type information away in the declaration of classes and methods. The code that is accessing generic types is compiled with casts to the expected type. List<String>.get(1) will generate a cast to String therefore nothing is actually missing in the generated code. It is only missing when you use reflection to e.g. deserialize a List via Jackson by passing List<String>.class. That unfortunately won't work because the generic type parameter is not part of the declaration, only the generated code.


> but AFAIK the Java compiler throws away any type information from the source code

You're thinking about generics. .class files preserve a whole bunch of type information (I'm building a .class decompiler in my free time, and I'm looking at that very same data in my debugger ATM).


You probably know this, but it's not obvious to the random reader.

Even generic types are available in the Java .class file and are accessible from the reflection API. Spring for example uses this quite heavily.


It depends on where you are at.

Type information is present in fields and class inheritance.

For example, a class like this

`class Foo implements Bar<String>`

Retains the fact that the generic type is a String.

That information is completely lost at method invocation. So a method that takes a `Bar<String>` ultimately compiles to a method that takes a `Bar` and knows nothing of the String.

To get that generic information down you have to engage in some fun tricks using either the class or field method I mentioned earlier. (Usually you do this with a second type parameter where it matters).


Java JITs don’t throw away type information. The bytecode carries strong types, though you have to do some work to get them out (some abstract interpretation). But that’s something you can do statically; no need for profiling or speculation.

In JS you can only get the types by profiling.


Generics are erased in Java. Not so in the CLR.


Yeah but the thread is about types, not generic types specifically.

Also it’s not really true that generics are erased. There’s that horrible thing javac does so that the VM can support reflection for generics. I get your point though.


Do you mean the passing of classes as additional arguments? I haven't seen the horrible thing.

Overall I think C# got this one right. Generics are right there in the binaries.

bUt ActuALLy, I think the rightest right thing is to do specialization up to representation at link time (or compile time), when the whole program is available, ala MLton. Virgil does this. Of course this is not possible in a dynamic code loading environment but I only have so many fucks to give in this life.


No, I meant that the class file format has metadata about what the generics were so some java.lang.reflect thing can query what the generic type was. I don’t think it comes with soundness guarantees.

I agree C# got this right.

I agree that doing specialization up to link time is ideal from a certain standpoint.


All generics in Java are of type Object from the start, you can't call any methods on them not implemented by Object. So it is wrong to say they are erased, they were never there to start with.


You can add a constraint to a generic declaration, which is an upper (or lower) bound on the allowable type arguments.

e.g.

  class A<Y extends X> {
    // in this scope, Y is known to be of at least type X,
    // so, we can call methods on expressions of type Y
    // that belong to type X (and not Object)
    Y m() { ... }
  }
Type arguments are omitted from usage sites. The technique is literally called erasure in papers and documentation.

  A<Foo> a = new A<>();
  Foo f = a.m(); // returns a Foo in source; in bytecode m() returns X (the erasure of Y)
                 // and the compiler inserts a cast from X -> Foo
Generic code is slower in Java because of these extra casts. To get back (most of) the performance, the JVM has to inline enough methods to be able to track the types from start to finish. It can't always.


And that's why Java is slower than it could be. If I'm using a String[] array, it can't contain anything but String objects, so the JVM does not have to check the type every time I read from that array. That's not true for ArrayList<String>, where the compiler must check the type of the object returned from `get(index)` (because it really is an Object and can contain anything), but it could be true with better language design.


This is not true, bytecode stores fields and classes as fully qualified strings: https://stackoverflow.com/a/17406592


Yeah what you said.


You're not wrong, but I always wonder how the verbosity and crazy class hierarchy of Java code doesn't make things slower.

I know that compilers are smart, but the API doesn't make things easier at all


The Java standard library isn't particularly verbose or insane. It's the "enterprise java" world that deals with the sort of excesses you're thinking about.


Verbosity makes the programmer slower not the program.


It depends. For example, you can verbosely write a loop to copy data from one array to another, or you can use memcpy. The latter is typically implemented as hand-optimised assembly - it's a pretty small number of operations in x86_64. The former - maybe the compiler optimizes, maybe it doesn't. If it does optimize it, it's definitely more complexity in the compiler and slows it down.

In general having some higher level, well optimized helpers can certainly reduce verbosity and increase speed. That said, some types of verbosity just make the programmer write what the compiler would translate to anyway. Or can end up being unnecessary - e.g. a strongly typed language with no inference certainly slows the programmer with no effect on the program (though maybe an effect on the type checker).
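
To keep it in the thread's language, the JS analogue of that choice looks roughly like this. The helper names are made up, and TypedArray.prototype.set stands in for memcpy here. I'm not claiming which one is faster on any particular engine.

```javascript
// Verbose: an explicit element-by-element copy loop.
function copyLoop(src, dst) {
  for (let i = 0; i < src.length; i++) dst[i] = src[i];
  return dst;
}

// Terse: hand the whole copy to the runtime in one call.
function copyBuiltin(src, dst) {
  dst.set(src);
  return dst;
}

const src = Uint8Array.from({ length: 16 }, (_, i) => i * 3);
const viaLoop = copyLoop(src, new Uint8Array(src.length));
const viaBuiltin = copyBuiltin(src, new Uint8Array(src.length));
```

Both produce the same bytes; the difference is only in who gets to decide how the copy is carried out.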


A fast memcpy used to be complicated to implement but these days I heard a simple

    rep movsb
will do the trick. This assumes that the source register points to source, destination register points to destination and counter register is set to number of bytes to copy.


I wish that was true but my data says it isn't:

- A verbose SIMD copy loop is often faster even on modern Intel CPUs that have the more modern rep implementation.

- A simple byte copy loop will beat both SIMD and rep when you're copying smaller amounts at a time.


Yes I understand some verbosity is structural and won't change anything, for example

    SomeClass x = new SomeClass();

won't make anything slower (ignoring auto for now)

Now, I read Java code and I can't help but wonder how things are harder

For example, setters and getters, what would be a simple memory write becomes a function call. Not complaining when you actually need it.

Reading the examples here (and those are not too bad) https://developer.android.com/reference/java/net/HttpURLConn... it seems you have to actually fight the API to get anything done

Why do you need to cast the return of a url.openConnection() to a HttpURLConnection? (I mean, how many connection types exist?)


So for getters and setters a JIT can inline the function call and in the end you have a direct memory access. In your example you have to cast because URLConnection could be a HttpURLConnection or JarURLConnection. But again a good JIT would speculate that it is always a HttpURLConnection and deoptimize if not.


The JVM also has full information about the class hierarchy at runtime. If you add a class that overrides a function it will deoptimize that overridden function to a virtual function, but otherwise the JVM will just treat it as a static function and inline it if necessary.


In modern Java I don't see why you would directly use an HttpURLConnection. The stdlib provides an easy to use HttpClient.

https://docs.oracle.com/en/java/javase/11/docs/api/java.net....


> I mean, how many connection types exist?

Two? https://developer.android.com/reference/java/net/URLConnecti...


The design of JavaScript is not very friendly to high performance. Even though Java is itself not easy to generate efficient code from, producing efficient code from JavaScript is far more difficult.


The claim wasn’t about JIT in general, but rather about JS.


Then it’s problematic to be so JS and V8 centric. Also the post specifically talked about JS being faster than Python and slower than Java so that’s what this thread is about.


I’m not advocating for the post or anything else, I was just clarifying a misunderstanding.



