The downside of v8 isolates is: you have to reinvent a whole bunch of stuff to g...

kentonv · on June 15, 2022

The future of compute is fine-grained. Cloudflare Workers is all about fine-grained compute, that is, splitting compute into small chunks -- a single HTTP request, rather than a single server instance. This is what allows us to run every customer's code (no matter how little traffic they get) in hundreds of locations around the world, at a price accessible to everyone.

The finer-grained your compute gets, the higher the overhead of strict process isolation gets. At the granularity Cloudflare is operating at, we've measured that imposing strict process isolation would mean an order of magnitude more overhead in terms of CPU and memory usage. It depends a bit on the workload, of course, but it's big. Yes, this is with all the tricks: zygote processes, etc.

We have plenty of defense-in-depth that we can do instead of process isolation, that doesn't have such enormous cost. [0] [1]

IMO, the platforms that stubbornly insist that process isolation is the only answer are the ones that are going to lose out eventually, just as bare metal has been supplanted by VMs, which are in turn being replaced by containers, etc. Over time we move to finer-grained primitives because doing so unlocks more power.

[0] https://blog.cloudflare.com/mitigating-spectre-and-other-sec...

[1] https://blog.cloudflare.com/spectre-research-with-tu-graz/

tptacek · on June 15, 2022

Point of order: containers aren't a mechanism to increase compute granularity; they're an abstraction designed to make compute easier to package and deploy. Containers can be bin-packed m:n into VMs or machines, but that's just a detail; over time, containers are all going to end up VM-scheduled, as VMs get cheaper and cheaper.

Meanwhile, the multitenant-containers/jails/zones people conclusively had the wrong side of the argument, despite how granular they were; multitenant shared-kernel is unsafe.

I have no opinion about whether language runtime isolation is competitively safe with virtualization. It's probably situational. I just object to the simple linear progression you're presenting.

ec109685 · on June 16, 2022

Containers _are_ a mechanism to increase compute granularity. Some of our workloads were having trouble scaling to 128 cores, and by using containers, we can have more of them running with fewer CPUs. Furthermore, it's straightforward to provide burst capacity to applications running within a container, given that all the OS needs to do is give those cgroups some extra CPU time for a limited period.
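
A minimal sketch of the cgroup knob being described, assuming cgroup v2, where a group's CPU bandwidth is set by writing "<quota_us> <period_us>" to its `cpu.max` file. The helper name and the 1.5-core and 4-core figures are hypothetical, chosen only for illustration; nothing here is taken from the commenter's actual setup.

```python
def cpu_max_line(cores: float, period_us: int = 100_000) -> str:
    """Format a cgroup v2 cpu.max value granting `cores` CPUs' worth of time.

    cpu.max holds "<quota_us> <period_us>": the group may use quota_us
    microseconds of CPU time in every period_us window.
    """
    if cores <= 0 or period_us <= 0:
        raise ValueError("cores and period_us must be positive")
    return f"{round(cores * period_us)} {period_us}"


# Hypothetical numbers: a steady-state limit and a temporary burst limit.
baseline = cpu_max_line(1.5)  # normal allowance
burst = cpu_max_line(4)       # raised allowance while bursting

print(baseline)
print(burst)
```

Granting a burst then amounts to writing the larger value to the group's `cpu.max` and restoring the baseline afterwards; nothing has to be resized or rebooted.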

csomar · on June 16, 2022

> Containers _are_ a mechanism to increase compute granularity. Some of our workloads were having trouble scaling to 128 cores, and by using containers, we can have more of them running with fewer CPUs.

As opposed to having VMs? Otherwise, I don't see how containers can increase granularity.

ec109685 · on June 18, 2022

As opposed to the original comment that says they aren’t useful for increasing granularity.

infogulch · on June 17, 2022

It sounds like containers enabled you to manage the deployment of multiple processes on a single machine more easily, which doesn't contradict the comment you're replying to. Would it not have been possible without containers?

kentonv · on June 16, 2022

> containers aren't a mechanism to increase compute granularity

Yes they are. Instead of thinking in terms of a whole running operating system with dozens of services, now you are thinking in terms of individual (micro?)services that are relatively isolated from each other. We stuff a lot more containers per box than we used to stuff VMs per box.

But it's true containers (of the namespace/cgroup/seccomp variety) have failed to be a sufficiently secure isolation mechanism to use them for multi-tenant scenarios, so instead we mostly pack containers from the same owner together.

I'd sort of argue that Firecracker and gVisor are actually container engines that happen to use CPU features meant for hardware VMs for additional security hardening. The granularity of compute that you put in them is more container-ish than traditional VM-ish.

willsher · on June 15, 2022

They are, or should be, entirely self-contained, such that whatever segregation is employed, be it hardware via a VM or in kernel with apparmor or SELinux, provides sufficient segregation for the workload. V8's problem is JavaScript and NPM, but limiting the blast radius with hardware virtualisation is a win for segregation, and V8 will win, at least for front end, because it’s got the mindset. As long as the library ecosystem cleans up.

tptacek · on June 16, 2022

"In kernel with apparmor or SELinux" can't possibly provide sufficient workload isolation, because it implies workloads share a kernel. It's easy to rattle off relatively recent kernel LPEs that no mandatory access control configuration would have prevented.

The Linux kernel simply wasn't designed to provide the kind of isolation "naive" containers want it to. Actually, generalize that out: Unix kernels in general weren't designed this way. It just doesn't work.

ignoramous · on June 16, 2022

End game then is LittleKernel/Zircon on Fly.io? When do we get to play with those?

willsher · on June 16, 2022

Whether software-based access control is sufficient depends on the workload and where in the stack the workload runs. I agree though, isolation based on hardware virtualisation is more secure and less complex. It also requires access to bare metal, so either a provider's service or running it yourself, which is a trade-off.

csomar · on June 16, 2022

> over time, containers are all going to end up VM-scheduled, as VMs get cheaper and cheaper

That kind of started with GitHub Actions.

comex · on June 15, 2022

This makes me really wish I was in a position to spend time trying to earn CloudFlare bug bounties. ;)

Sure, there are massive advantages to using finer-grained sandboxes, but that doesn’t mean it’s safe.

arinlen · on June 16, 2022

> The future of compute is fine-grained. Cloudflare Workers is all about fine-grained compute, that is, splitting compute into small chunks -- a single HTTP request, rather than a single server instance. This is what allows us to run every customer's code (no matter how little traffic they get) in hundreds of locations around the world, at a price accessible to everyone.

I don't think this holds any truth at all.

Cloudflare Workers were designed to be so "fine-grained" because their whole rationale is to run a very small compute step on each request, doing a very small touch-up to each request/response with negligible or tolerable performance impact.

This is not a major paradigm change. It's a request handler placed on edge servers to do a minor touch up without the client noticing. Conceptually they are the same as a plain old Controller from Spring or Express. They just have tighter constraints because they run on resource-constrained hardware and handle performance-constrained requests. Other than this, they are a plain old request handler.

yesbabyyes · on June 16, 2022

Considering you're engaging the tech lead of the tech in question, it's intriguing what you mean by this. Is it that kentonv is lying or that they're mistaken, or something else?

kentonv · on June 16, 2022

To be clear, I'm the tech lead of Cloudflare Workers, and wrote the core runtime implementation. Sorry, I should have stated that more clearly above.

While minor request/response rewrites were an early target use case, the platform is very much intended to support building whole applications running entirely on Workers. We do think this is a major paradigm shift.

bluelightning2k · on June 16, 2022

I like CloudFlare - but still haven't heard back since signing up for Workers for Platforms the day it was announced

sam0x17 · on June 15, 2022

I love CloudFlare workers EXCEPT the ridiculous limit they place on outgoing web requests. It makes doing something like writing a distributed web scraper or crawler impractical, so poo poo on them.

erichocean · on June 15, 2022

> It makes doing something like writing a distributed web scraper or crawler impractical

That's why they have the limits.

nindalf · on June 16, 2022

Never thought I’d see a self-aware wolf in a technical discussion.

csomar · on June 16, 2022

I don't think that's what CloudFlare workers are for.

gigel82 · on June 15, 2022

> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process.

That depends on your scenario. In our case, all the JavaScript code is ours so we're not worried about it trying to exploit bugs and escape into native. Running multiple Isolates / Contexts gives us isolation on the JS side but also lots of sharing (the v8::Platform object and several other globals are shared in the process).

Of course, if you're running untrusted JavaScript code, what you're saying makes sense (though I wouldn't go as far as Firecracker; low-rights sandbox processes à la Chrome do the job).

csomar · on June 16, 2022

My understanding is that you are worried about code other customers are running (since both of you are running in the same process); but I might be wrong/not getting the angle of all of this?

mrkurt · on June 15, 2022

Yes, fair. That makes complete sense.

ignoramous · on June 15, 2022

> Process isolation is slightly heavier weight (though forking is wicked fast) but more secure. Processes give you the advantage of using cgroups to restrict resources, namespaces to limit network access, etc. My understanding is that this is exactly what Deno Deploy does.

Interestingly, as does Android: https://stackoverflow.com/a/12703292

afiori · on June 15, 2022

> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process. They need to be sandboxed in isolated processes. Chrome sandboxes them in isolated processes.

Is it because V8 isolates rely on the process sandbox or just to have a double sandbox?

From https://blog.cloudflare.com/cloud-computing-without-containe...

> Isolates are lightweight contexts that group variables with the code allowed to mutate them. Most importantly, a single process can run hundreds or thousands of Isolates, seamlessly switching between them.

Cloudflare runs multiple isolates per process.

> We also have a few layers of security built on our end, including various protections against timing attacks, but V8 is the real wonder that makes this compute model possible

They also talk about how they removed many browser APIs for security, but they seem to rely heavily on V8 isolates for sandboxing.
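
As a rough illustration of "many isolates, one process", here is a toy Python sketch. It is not how V8 works, and `exec` is not a security boundary; `ToyIsolate` and the `hit` handler are hypothetical names. It only shows the shape of the idea: each isolate gets its own global state, while all of them share a single OS process.

```python
import os

# Source that every toy isolate loads; each gets a private `counter`.
HANDLER_SOURCE = """
counter = 0

def hit():
    global counter
    counter += 1
    return counter
"""


class ToyIsolate:
    """Groups variables with the code allowed to mutate them (toy model only)."""

    def __init__(self, source: str) -> None:
        self.globals: dict = {}
        exec(source, self.globals)  # code runs against this isolate's own globals
        self.pid = os.getpid()      # recorded to show the process is shared

    def call(self, name: str, *args):
        return self.globals[name](*args)


a = ToyIsolate(HANDLER_SOURCE)
b = ToyIsolate(HANDLER_SOURCE)

a.call("hit")
a.call("hit")
b.call("hit")

# Each isolate kept its own counter, yet both live in the same process.
print(a.globals["counter"], b.globals["counter"], a.pid == b.pid)
```

The sharing is what makes this cheap, and it is also why a memory-safety bug in the shared runtime would expose every isolate in the process at once.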

mrkurt · on June 15, 2022

CloudFlare actually makes the case for running in dedicated processes on their own blog: https://blog.cloudflare.com/mitigating-spectre-and-other-sec...

Running untrusted code in the same process gives that code a tremendous blast radius if they exploit a vulnerability in, say, a fetch implementation. I do not understand why they would do this.

Isolating processes adds a layer of protection. People who exploit your implementation have limited access to the system (they can't read another user's memory, for example, which often contains sensitive info – like private keys).

KVM adds _another_ layer.

If you have a process running in a namespace within a KVM, someone would need to exploit the process, the Linux Kernel, and the underlying virtualization extensions to do serious damage.
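
The layering argument can be sketched as a simple conjunction: serious damage requires a working exploit for every layer in the stack. The layer names come from the comment above; the function name and the attacker's exploit set are hypothetical.

```python
# Layers named in the comment, from innermost to outermost.
LAYERS = ["process", "linux kernel", "virtualization extensions"]


def can_do_serious_damage(layers, exploits_held):
    """True only if the attacker holds an exploit for every isolation layer."""
    return all(layer in exploits_held for layer in layers)


# Hypothetical attacker who has only broken out of the language runtime/process.
attacker = {"process"}

print(can_do_serious_damage(LAYERS, attacker))      # False
print(can_do_serious_damage(LAYERS[:1], attacker))  # True
```

The second call models a stack with the process as its only layer: one exploit is then enough.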

tptacek · on June 15, 2022

The process, the Linux kernel, underlying virtualization extensions (maybe; not totally following that one) and the mandatory access control rules applied to the VM runtime --- in Firecracker's case, the BPF jail it runs in.

bitwize · on June 15, 2022

Belt, suspenders, antigravity device, as Steve Summit has been known to say.

megous · on June 15, 2022

> Under no circumstances should CloudFlare or anyone else be running multiple isolates in the same OS process.

TFA: "Most importantly, a single process can run hundreds or thousands of Isolates, seamlessly switching between them." :)

khuey · on June 15, 2022

And that's true as long as all the isolates trust each other.

amerine · on June 15, 2022

This is such a great answer, I just wanted to send a +1 in agreement about isolates eventually being able to leverage cpu virt. Much nodding happened while reading your answer.