
The downside of v8 isolates is: you have to reinvent a whole bunch of stuff to get good isolation (of both security and resources).

Here's an example. Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process. They need to be sandboxed in isolated processes. Chrome sandboxes them in isolated processes.

Process isolation is slightly heavier weight (though forking is wicked fast) but more secure. Processes give you the advantage of using cgroups to restrict resources, namespaces to limit network access, etc.

My understanding is that this is exactly what Deno Deploy does (https://deno.com/deploy).
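As a rough illustration of the per-process resource control mentioned above: cgroups and namespaces need a Linux host and usually elevated privileges, so this sketch swaps in plain POSIX rlimits, which any unprivileged process can apply to its children. The function name `run_with_cpu_limit` and the one-second budget are made up for the example; this is not how Deno Deploy or anyone else necessarily does it.

```python
import resource
import subprocess
import sys


def run_with_cpu_limit(code: str, cpu_seconds: int = 1) -> int:
    """Run `code` in a child Python process capped at `cpu_seconds` of CPU time.

    Returns the child's exit status. A negative value means the child
    was terminated by a signal (here, for exceeding its CPU budget).
    """

    def apply_limits() -> None:
        # Runs in the child between fork() and exec(), so the parent
        # process is never subject to the limit.
        resource.setrlimit(resource.RLIMIT_CPU, (cpu_seconds, cpu_seconds))

    completed = subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,  # POSIX only
        capture_output=True,
        timeout=30,  # wall-clock backstop in case the rlimit is not enforced
    )
    return completed.returncode


if __name__ == "__main__":
    print("well-behaved child:", run_with_cpu_limit("print('hello')"))
    print("busy-loop child:   ", run_with_cpu_limit("while True: pass"))
```

The point is only that the kernel enforces the limit from outside the sandboxed code. With several isolates in one process there is no equivalent per-tenant knob at the OS level, which is the "reinvent a whole bunch of stuff" part.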

Once you've forked a process, though, you're not far off from just running something like Firecracker. This is both true and intense bias on my part. I work on https://fly.io, we use Firecracker. We started with v8 and decided it was wrong. So obviously I would be saying this.

Firecracker has the benefit of hardware virtualization. It's pretty dang fast. The downside is, you need to run on bare metal to take advantage of it.

My guess is that this is all going to converge. v8 isolates will someday run in isolated processes that can take advantage of hardware virtualization. They already _should_ run in isolated processes that take advantage of OS level sandboxing.

At the same time, people using Firecracker (like us!) will be able to optimize away cold starts, keep memory usage small, etc.

The natural end state is to run your v8 isolates or wasm runtimes in a lightweight VM.



The future of compute is fine-grained. Cloudflare Workers is all about fine-grained compute, that is, splitting compute into small chunks -- a single HTTP request, rather than a single server instance. This is what allows us to run every customer's code (no matter how little traffic they get) in hundreds of locations around the world, at a price accessible to everyone.

The finer-grained your compute gets, the higher the overhead of strict process isolation gets. At the point Cloudflare is operating at, we've measured that imposing strict process isolation would mean an order of magnitude more overhead, in terms of CPU and memory usage. It depends a bit on the workload of course, but it's big. Yes, this is with all the tricks, zygote processes, etc.
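To make the shape of that argument concrete, here is a toy cost model. It is only a sketch: the function `overhead_ratio` and the fixed costs of 10 and 1 units are hypothetical placeholders, not Cloudflare's measurements. It just shows that a fixed per-process cost dominates as the useful work per tenant shrinks.

```python
def overhead_ratio(process_fixed: float, isolate_fixed: float, work: float) -> float:
    """Cost of one-process-per-tenant relative to one-isolate-per-tenant.

    All arguments share one arbitrary unit (say, MB, or CPU-ms per request):
      process_fixed: fixed cost of giving a tenant a dedicated process
      isolate_fixed: fixed cost of giving a tenant an isolate in a shared process
      work:          cost of the tenant's actual work, identical in both setups
    """
    if isolate_fixed + work <= 0:
        raise ValueError("total cost must be positive")
    return (process_fixed + work) / (isolate_fixed + work)


if __name__ == "__main__":
    # Hypothetical fixed costs: a process costs 10 units, an isolate 1 unit.
    for work in (1000.0, 100.0, 10.0, 1.0, 0.0):
        print(f"work={work:>6}: {overhead_ratio(10.0, 1.0, work):.2f}x")
```

When the work is large the ratio sits near 1 and process isolation is almost free; as the work approaches zero the ratio approaches `process_fixed / isolate_fixed`.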

We have plenty of defense-in-depth that we can do instead of process isolation, that doesn't have such enormous cost. [0] [1]

IMO, the platforms that stubbornly insist that process isolation is the only answer are the ones that are going to lose out eventually, just as bare metal has been supplanted by VMs, which are in turn being replaced by containers, etc. Over time we move to finer-grained primitives because doing so unlocks more power.

[0] https://blog.cloudflare.com/mitigating-spectre-and-other-sec...

[1] https://blog.cloudflare.com/spectre-research-with-tu-graz/


Point of order: containers aren't a mechanism to increase compute granularity; they're an abstraction designed to make compute easier to package and deploy. Containers can be bin-packed m:n into VMs or machines, but that's just a detail; over time, containers are all going to end up VM-scheduled, as VMs get cheaper and cheaper.

Meanwhile, the multitenant-containers/jails/zones people conclusively had the wrong side of the argument, despite how granular they were; multitenant shared-kernel is unsafe.

I have no opinion about whether language runtime isolation is competitively safe with virtualization. It's probably situational. I just object to the simple linear progression you're presenting.


Containers _are_ a mechanism to increase compute granularity. Some of our workloads were having trouble scaling to 128 cores, and by using containers, we can have more of them running with fewer CPUs. Furthermore, it's straightforward to provide burst capacity to applications running within a container, since all the OS needs to do is give those cgroups some extra CPU time for a limited period.
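For anyone unfamiliar with the cgroup side of this, a minimal sketch of what "give those cgroups some extra CPU time" can look like with cgroup v2, where a group's `cpu.max` file holds `<quota> <period>` in microseconds. The helper names `cpu_max` and `burst_plan` are invented for the example, and the code only formats the values; actually applying them means writing to the group's `cpu.max` file (typically under `/sys/fs/cgroup`) with sufficient privileges.

```python
from typing import Optional, Tuple

DEFAULT_PERIOD_US = 100_000  # cgroup v2's default period, in microseconds


def cpu_max(cores: Optional[float], period_us: int = DEFAULT_PERIOD_US) -> str:
    """Format a value for a cgroup v2 `cpu.max` file: "<quota> <period>".

    `cores=None` means no limit, which cgroup v2 spells "max".
    """
    if period_us <= 0:
        raise ValueError("period must be positive")
    if cores is None:
        return f"max {period_us}"
    if cores <= 0:
        raise ValueError("cores must be positive")
    # Quota is the CPU time the group may consume per period.
    quota_us = round(cores * period_us)
    return f"{quota_us} {period_us}"


def burst_plan(base_cores: float, burst_cores: float) -> Tuple[str, str]:
    """Return (steady-state value, temporary burst value) for `cpu.max`."""
    if burst_cores < base_cores:
        raise ValueError("burst allowance should not be below the base allowance")
    return cpu_max(base_cores), cpu_max(burst_cores)


if __name__ == "__main__":
    steady, burst = burst_plan(base_cores=0.5, burst_cores=2.0)
    print(steady)  # 50000 100000
    print(burst)   # 200000 100000
```

A scheduler would write the burst value when the application needs headroom and restore the steady-state value afterwards.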


> Containers _are_ a mechanism to increase compute granularity. Some of our workloads were having trouble scaling to 128 cores, and by using containers, we can have more of them running with fewer CPUs.

As opposed to having VMs? Otherwise, I don't see how containers can increase granularity.


As opposed to the original comment that says they aren’t useful for increasing granularity.


It sounds like containers enabled you to manage the deployment of multiple processes on a single machine more easily, which doesn't contradict the comment you're replying to. Would it not have been possible without containers?


> containers aren't a mechanism to increase compute granularity

Yes they are. Instead of thinking in terms of a whole running operating system with dozens of services, now you are thinking in terms of individual (micro?)services that are relatively isolated from each other. We stuff a lot more containers per box than we used to stuff VMs per box.

But it's true containers (of the namespace/cgroup/seccomp variety) have failed to be a sufficiently secure isolation mechanism to use them for multi-tenant scenarios, so instead we mostly pack containers from the same owner together.

I'd sort of argue that Firecracker and gVisor are actually container engines that happen to use CPU features meant for hardware VMs for additional security hardening. The granularity of compute that you put in them is more container-ish than traditional VM-ish.


They are, or should be, entirely self-contained, such that whatever segregation is employed - be it hardware via a VM or in kernel with apparmor or SELinux - provides sufficient segregation for the workload. V8's problem is JavaScript and NPM, but limiting the blast radius with hardware virtualisation is a win for segregation, and v8 will win, at least for front end, because it’s got the mindset, as long as the library ecosystem cleans up.


"In kernel with apparmor or SELinux" can't possibly provide sufficient workload isolation, because it implies workloads share a kernel. It's easy to rattle off relatively recent kernel LPEs that no mandatory access control configuration would have prevented.

The Linux kernel simply wasn't designed to provide the kind of isolation "naive" containers want it to. Actually, generalize that out: Unix kernels in general weren't designed this way. It just doesn't work.


End game then is LittleKernel/Zircon on Fly.io? When do we get to play with those?


Whether software-based access control is sufficient depends on the workload and where in the stack the workload runs. I agree, though: hardware-virtualisation-based isolation is more secure and less complex. It also requires access to bare metal, so either a provider's service or running it yourself, which is a trade-off.


> over time, containers are all going to end up VM-scheduled, as VMs get cheaper and cheaper

That kind of started with GitHub Actions.


This makes me really wish I was in a position to spend time trying to earn CloudFlare bug bounties. ;)

Sure, there are massive advantages to using finer-grained sandboxes, but that doesn’t mean it’s safe.


> The future of compute is fine-grained. Cloudflare Workers is all about fine-grained compute, that is, splitting compute into small chunks -- a single HTTP request, rather than a single server instance. This is what allows us to run every customer's code (no matter how little traffic they get) in hundreds of locations around the world, at a price accessible to everyone.

I don't think this holds any truth at all.

Cloudflare Workers were designed to be so "fine-grained" because their whole rationale is to run very small compute steps that do a very small touch-up on each request/response, with negligible or tolerable performance impact.

This is not a major paradigm change. It's a request handler placed on edge servers to do a minor touch up without the client noticing. Conceptually they are the same as a plain old Controller from Spring or Express. They just have tighter constraints because they run on resource-constrained hardware and handle performance-constrained requests. Other than this, they are a plain old request handler.


Considering you're engaging the tech lead of the tech in question, I'm curious what you mean by this. Is it that kentonv is lying or that they're mistaken, or something else?


To be clear, I'm the tech lead of Cloudflare Workers, and wrote the core runtime implementation. Sorry, I should have stated that more clearly above.

While minor request/response rewrites were an early target use case, the platform is very much intended to support building whole applications running entirely on Workers. We do think this is a major paradigm shift.


I like CloudFlare - but still haven't heard back since signing up for Workers for Platforms the day it was announced


I love CloudFlare workers EXCEPT the ridiculous limit they place on outgoing web requests. It makes doing something like writing a distributed web scraper or crawler impractical, so poo poo on them.


> It makes doing something like writing a distributed web scraper or crawler impractical

That's why they have the limits.


Never thought I’d see a self-aware wolf in a technical discussion.


I don't think that's what CloudFlare workers are for.


> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process.

That depends on your scenario. In our case, all the JavaScript code is ours so we're not worried about it trying to exploit bugs and escape into native. Running multiple Isolates / Contexts gives us isolation on the JS side but also lots of sharing (the v8::Platform object and several other globals are shared in the process).

Of course, if you're running untrusted JavaScript code what you're saying makes sense (though I wouldn't go as far as Firecracker; low-rights sandbox processes à la Chrome do the job).


My understanding is that you are worried about code other customers are running (since both of you are running in the same process); but I might be wrong/not getting the angle of all of this?


Yes, fair. That makes complete sense.


> Process isolation is slightly heavier weight (though forking is wicked fast) but more secure. Processes give you the advantage of using cgroups to restrict resources, namespaces to limit network access, etc. My understanding is that this is exactly what Deno Deploy does.

Interestingly, as does Android: https://stackoverflow.com/a/12703292


> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process. They need to be sandboxed in isolated processes. Chrome sandboxes them in isolated processes.

Is it because V8 isolates rely on the process sandbox or just to have a double sandbox?

from https://blog.cloudflare.com/cloud-computing-without-containe...

> Isolates are lightweight contexts that group variables with the code allowed to mutate them. Most importantly, a single process can run hundreds or thousands of Isolates, seamlessly switching between them.

Cloudflare runs multiple isolates per process.

> We also have a few layers of security built on our end, including various protections against timing attacks, but V8 is the real wonder that makes this compute model possible

They also talk about how they removed many browser APIs for security, but they seem to rely heavily on V8 isolates for sandboxing.


CloudFlare actually makes the case for running in dedicated processes on their own blog: https://blog.cloudflare.com/mitigating-spectre-and-other-sec...

Running untrusted code in the same process gives that code a tremendous blast radius if they exploit a vulnerability in, say, a fetch implementation. I do not understand why they would do this.

Isolating processes adds a layer of protection. People who exploit your implementation have limited access to the system (they can't read another user's memory, for example, which often contains sensitive info – like private keys).

KVM adds _another_ layer.

If you have a process running in a namespace within a KVM, someone would need to exploit the process, the Linux Kernel, and the underlying virtualization extensions to do serious damage.
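A toy sketch of the "can't read another user's memory" point. Python threads stand in for code that has already broken out of an in-process sandbox; this is not V8's isolation model (isolates have separate heaps), and `TENANT_SECRETS` and both function names are hypothetical. It only shows what sharing an address space means once the language-level boundary has failed.

```python
import subprocess
import sys
import threading

# Hypothetical per-tenant secret held in this process's memory.
TENANT_SECRETS = {"tenant_a": "private-key-for-a"}


def snoop_from_thread() -> str:
    """Code running inside our process (here, a thread) can just read the secret."""
    seen = []
    worker = threading.Thread(
        target=lambda: seen.append(TENANT_SECRETS["tenant_a"])
    )
    worker.start()
    worker.join()
    return seen[0]


def snoop_from_separate_process() -> str:
    """A separately exec'd worker has its own address space and no such data."""
    probe = "print('found' if 'TENANT_SECRETS' in globals() else 'not found')"
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    print("same process:    ", snoop_from_thread())
    print("separate process:", snoop_from_separate_process())
```

To reach the secret from the separate process, an attacker would need a further exploit against the kernel, or against the hypervisor once a VM is added to the stack.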


The process, the Linux kernel, underlying virtualization extensions (maybe; not totally following that one) and the mandatory access control rules applied to the VM runtime --- in Firecracker's case, the BPF jail it runs in.


Belt, suspenders, antigravity device, as Steve Summit has been known to say.


> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process.

TFA: "Most importantly, a single process can run hundreds or thousands of Isolates, seamlessly switching between them." :)


And that's true as long as all the isolates trust each other.


This is such a great answer, I just wanted to send a +1 in agreement about isolates eventually being able to leverage cpu virt. Much nodding happened while reading your answer.



