Running Containers on AWS Lambda (earthly.dev)
170 points by shaicoleman on April 27, 2022 | 78 comments


Author here. I didn't expect this to show up here. I found containers on lambda to work really well. I can easily test them locally and they scale to zero when I'm not using them in AWS. So it's perfect for things that are idle a lot.

I have a follow-up coming where I use Go in a container, and the request speed got a lot better.

The service in this article is my HTML-to-text converter, so having a container where I could install OS dependencies was crucial to getting this working. It's covered here and here:

https://news.ycombinator.com/item?id=30829568

https://earthly-tools.com/text-mode


> I have a follow-up coming where I use Go in a container, and the request speed got a lot better.

My team transitioned a lightweight Python-based lambda from zip to container-based packaging, and we also saw a small (but definitely real) speedup in request time. I'm not the one who did the benchmarking, but IIRC it was about 10ms faster: from ~50ms to ~40ms or so.

edit: originally phrased it backwards, making it seem like the container method was slower.


My experience has been the opposite, with simple python docker containers showing noticeably slower cold starts than python zips. I've also found the size of the zip package matters (smaller is faster). However things change over time, maybe there have been recent improvements?

Specifically regarding cold starts of lambda instances, I recall this excellent article from Jan 2021 which measures cold start latencies for a variety of AWS lambda runtimes:

https://mikhail.io/serverless/coldstarts/aws/languages/

I would love to see a 2022 update.


Our benchmarking was only against a warmed lambda; I don't believe any language can do a 50ms cold start on Lambda. The fastest example in your link is Python at around 150ms, and that's an outlier from the average.


Are you saying latency is lower with a container than with plain Python?

I recently made a project for the company and didn't even consider containers (which I love, btw) because I thought I could save some ms without a container.

Do you know if lambda containers performance is also good in nodejs? I need to have python and nodejs lambdas.


Versus the .zip file deploy method, if that's what you mean: yes, the container had lower latency. The core code didn't change. This is just the response time on a warmed-up lambda. We didn't benchmark cold-start times, but just from using it the cold starts don't seem noticeably worse (maybe they're also better? Honestly, I don't know).

Sorry don't know about nodejs yet. This was for a new lambda that wasn't in production yet, so it made sense to experiment.

The speedup was unexpected but nice. We're also assuming this is a static ~10ms improvement and not a 20% one, but we only tested one lambda. I'm interested to see Adam's follow-up post and find out if I'm wrong about that.

Oh, also just want to say: since the inner workings of Lambda are hidden, it could be some totally different reason why containers perform better. Since it's a newer feature, maybe it's running a newer version of the backend that the zip-file backend will be moved to. Maybe container-based lambdas are allocated on newer hardware. Maybe containers are actually just faster. Can't ever really know without getting hired by Amazon and signing an NDA, lol.
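For what it's worth, a warm-invocation benchmark like the one we ran can be structured roughly like this (a sketch; the stub callable stands in for a real HTTP request to a deployed function URL):

```python
import time
import statistics

def measure_warm_latency(invoke, warmup=5, runs=50):
    """Time repeated invocations of `invoke`, discarding warm-up calls.

    `invoke` is any zero-argument callable -- e.g. an HTTP request to a
    deployed Lambda endpoint. The first `warmup` calls absorb cold starts
    so the remaining samples reflect warm latency only.
    """
    for _ in range(warmup):
        invoke()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        invoke()
        samples.append((time.perf_counter() - start) * 1000)  # milliseconds
    return {
        "p50_ms": statistics.median(samples),
        "mean_ms": statistics.fmean(samples),
        "max_ms": max(samples),
    }

# Example with a stub in place of a real Lambda call:
stats = measure_warm_latency(lambda: time.sleep(0.001), warmup=2, runs=10)
```

Comparing the p50 rather than the mean helps keep the occasional outlier invocation from skewing a small sample.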


Someone here recently said that the Lambda Python runtime is not stock Python (apparently it's compiled without optimisations: https://news.ycombinator.com/item?id=30953202) - maybe that has something to do with it?


It definitely could. We're talking on the order of ~10ms; even the Python version alone can account for that kind of improvement (or better, depending on the code, of course).

I was interested in the cold starts, though. I will have to experiment in the future.


I had a much better experience with GCP Cloud Run. Prepare an OCI/Docker image, type `gcloud run` with a few flags, and you're done. In 2021 they added a bunch of features which, in my opinion, make Cloud Run one of the most trivial ways of deploying containers intended for production use.


Simply the best cloud service from any of the three big CSPs, bar none. The day AWS supports this more out of the box on Lambda (no, I will not use your required base image or signatures) will be the day containers and serverless become one (like they already are on GCP).


Fly.io isn't one of the "three big CSPs", but they are awesome.


Prepare image…? Just let buildpacks sort it out and forget the Dockerfile ever existed.


We're trying this out at a large insurance company. Historically, actuarial teams created Excel workbooks, R code, and Python. Those models were then given to development teams to implement in a different language. As one might guess, there were loads of bugs and the process was slow. Now we're going to deploy an R lambda, owned by DevOps, which integrates all the I/O into dataframes. The lambda calls a calculation in R that takes those dataframes and returns a dataframe answer. If all goes well (the prototype works fine), we saved probably 500k and 6 months.


You’ll have to deal with lambda cold starts if you want it to be performant:

> When the Lambda service receives a request to run a function via the Lambda API, the service first prepares an execution environment. During this step, the service downloads the code for the function, which is stored in an internal Amazon S3 bucket (or in Amazon Elastic Container Registry if the function uses container packaging). It then creates an environment with the memory, runtime, and configuration specified. Once complete, Lambda runs any initialization code outside of the event handler before finally running the handler code.

https://aws.amazon.com/blogs/compute/operating-lambda-perfor...


It's not entirely accurate that Lambda pulls container images from ECR at start-up time. Here's me talking about what happens behind the scenes (which, in the real world, often makes things orders of magnitude faster than a full container pull): https://www.youtube.com/watch?v=A-7j0QlGwFk

But your broader point is correct. Cold starts are a challenge, but one the team is constantly working on and improving. You can also help reduce cold-start time by picking languages without heavy VMs (Go, Rust, etc.), by reducing work done in 'static' code, and by minimizing the size of your container image. All those things will get less important over time, but they can all have a huge impact on cold starts now.

Another option is Lambda Provisioned concurrency, which allows you to pay a small amount to control how many sandboxes Lambda keeps warm on your behalf: https://docs.aws.amazon.com/lambda/latest/dg/provisioned-con...


Pardon the ignorance, but is the state of lambda containers considered to be single-threaded? Or can they serve requests in parallel?

If I had a Spring Java (well, Kotlin) app that processes stuff off SQS (large startup time but potentially very high parallelism), would you recommend running ECS containers and scale them up based on SQS back-pressure? Or would you package them up as Lambdas with provisioned capacity? Throughput will be fairly consistent (never zero) and occasionally bursty.


I would not use Spring, or Java for that matter, for lambdas, speaking from experience.

"Lambda containers" is a bit of a misnomer, as you can have multiple invocations of a function run on the same container; it's the initial startup once the container has shut down that is slow (which can be somewhat avoided by a "warming" function set to trigger on a cron).

I would definitely go with containers if your intention is to use Spring. ECS containers can autoscale just the same as lambdas.

There's some work being done to package Java code to run more efficiently in serverless computing environments, but IIRC, it's not there yet.


Thanks! I wasn't planning it, but can't hurt to ask.

When I looked, the Lambda API seemed uncomplicated to implement (I saw an example somewhere), and it felt like you could just write a few controllers and gain the ability to run a subset of functionality in Lambda, especially if your app could be kept warm.

(to your cron comment, I thought that the reserved capacity would mean the container would be forcibly kept warm?)


Provisioned concurrency is nice but can get pricey, especially in an autoscaling scenario. It moves you from a pay-per-usage model to an hourly-fee-plus-usage model. I would wait until your requirements show you absolutely need it. For most use cases, you will either have enough traffic to keep the lambda warm or can incur the cost of the cold start. Warming functions did the trick for us. If you think about it, provisioned concurrency is paying for managed warmers.
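A warming function can be as simple as a handler that short-circuits on the scheduled ping. This is a sketch assuming an EventBridge (CloudWatch Events) scheduled rule as the trigger; the early-return check is illustrative:

```python
def handler(event, context):
    # EventBridge scheduled rules deliver events with source "aws.events".
    # Treat those as warming pings and return immediately, so the sandbox
    # stays resident without running the real (possibly slow) work.
    if event.get("source") == "aws.events":
        return {"warmed": True}

    # ... real request handling goes here ...
    return {"statusCode": 200, "body": "handled"}
```

A rule like rate(5 minutes) pointed at the function keeps a sandbox warm; note this only warms as many instances as it invokes concurrently.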


Spring is one thing; Java is really another. One can use Java without reflection, and then cold starts are really reduced. Additionally, there's GraalVM, an optimized VM that should be even faster. On top of that, if reflection is not used, these days one can compile Java to a native image, which has none of these problems.


When you say fast though, you really are talking in comparison to other methods of using Java on Lambda. But compared to using something like Go, they are all slow.


Each container handles requests serially. This doesn’t preclude you from spawning multiple threads in Lambda to do background work though.


Serially, but up to ten requests in a single batch

> By default, Lambda polls up to 10 messages in your queue at once and sends that batch to your function.

From https://docs.aws.amazon.com/lambda/latest/dg/with-sqs.html
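Concretely, an SQS-triggered function receives the whole batch in one `event` and iterates over `Records` (a minimal sketch; the message bodies here are assumed to be JSON):

```python
import json

def handler(event, context):
    # Lambda delivers up to batch-size SQS messages in a single invocation;
    # each message is one entry in event["Records"].
    processed = []
    for record in event["Records"]:
        body = json.loads(record["body"])
        # ... real per-message work goes here ...
        processed.append(record["messageId"])
    # If the handler raises, the whole batch becomes visible again and is
    # retried (eventually landing in the DLQ after maxReceiveCount).
    return {"batchSize": len(processed), "messageIds": processed}
```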


I'm not an expert in this area, but have you all considered using CRIU[0] (Checkpoint/Restore In Userspace) for container-based Lambdas, to allow users to snapshot their containers after most of a language's VM (like Python's) has performed its startup? Do you think this would reduce startup times?

0. https://criu.org/Docker


That's a good question!

Accelerating cold starts with checkpoint and restore is a good idea. There's been a lot of research in academia around it, and some progress in industry too. It's one of those things, though, that works really well for specific use-cases or at small scale, but takes a lot of work to generalize and scale up.

For example, one challenge is making sure that random number generators (RNGs) never return the same values after cloning (because that completely breaks GCM mode, for example). More details here: https://arxiv.org/abs/2102.12892

As for CRIU specifically, it turned out not to be the right fit for Lambda, because Lambda lets you create multiple processes, interact with the OS in various ways, store local state, and other things that CRIU doesn't model in the way we needed. It's cool stuff, though, and likely a good fit for other use-cases.


They have a feature called "provisioned concurrency" where basically one "instance" of your lambda (or however many you want to configure) stays running warm, so that it can handle requests quickly.

I know it defeats the conceptual purpose of serverless, but it's a nice workaround while cloud platforms work on mitigating the cold start problem.


That'll also cost you $$$$ and takes any provisioned lambda out of the free tier. Also note that only your specified number of instances will stay warm, meaning if your lambda needs to scale up, you risk slow cold starts on additional instances beyond the number you provisioned. You could specify the number of reserved concurrent instances (limiting the number that run), but that also costs money and will eat into your quota.

Using containers for Lambda is generally a bad idea for anything TypeScript/JavaScript that handles a realtime request - you just can't beat the speed of a single JavaScript file (compiled, in the case of TS). AWS CDK now ships with the NodejsFunction construct as well, which makes generating those a breeze with esbuild.


It cost me like $3 a month to get benefit from it. For what it's worth I don't use it for synchronous web requests.


I've had some pretty good luck slimming things down as well. That's usually a win-win even for non-Lambda cases (trying things like docker-slim, or switching stuff that needs a quick response to Go).

That said, the $2-5/month is fine as well for some cases.


It’s nice to have that dial. Running lambda with provisioned concurrency is still a very managed experience: much different than running a container cluster.


If cold starts are at all an issue for whatever use-case, you can just do a warming job like we do (in our case it's built into Ruby on Jets). We find invoking every 30 seconds is enough to never have a cold start. It's still quite cheap as well. The lambda portion of our bill (with tons of platform usage) is still incredibly low / low double digits.

Just doing a warming job with no other usage falls well within free tier usage, I can confirm.


This is definitely an issue especially with infrequently accessed functions but I've seen cold start issues regardless. I assume some scaling events will cause cold starts (measured in seconds).

There's a good argument to go with packaged code instead of containers if you can manage the development complication and versioning (cold starts measured in milliseconds).


My team owns a Node 14 JS lambda application that is completely serverless. We're focused on keeping our lambdas small, with single responsibilities, and we leverage lambda layers for anything common across multiple lambdas. Cold starts are a concern, but they're negligible (< 100ms tops) and unnoticed by our client apps. We host all of our static web and JS assets via CloudFront so they load quickly. If a user happens to visit our site when a lambda incurs a cold start, it's not perceptible. We were much more worried initially about cold starts than it turned out we needed to be. Keeping lambdas small, picking a good language for the job, and leveraging lambda layers help minimize this a lot.


Lambda is Greek for CGI script.


In a way, sure.

But Lambda also does things that CGI didn't do: dynamic auto-scaling, integration with queues and streams for async processing, strong security isolation with fine-grained authorization, host failure detection and recovery, datacenter failure resilience, and many other things. The interface being familiar and relatively analogous to existing things is intentional. The point is to be different where being different helps, and familiar where being different doesn't help.


You get like 90% of that dropping your CGI directory on NFS shared among multiple hosts. AWS offers a full featured host environment, so you don't have to build that part, but the part you write is quite similar.


Did CGI allow you to have your individual CGI scripts in separate containers?


If it talks via FastCGI you could have individual containers per whatever unit of segmentation you'd like.


As someone who has been around long enough to actually remember setting up /cgi-bin/... there is a lot more to lambdas.

They are scalable (including, importantly, down to zero!), have their own dedicated resources for each process, and are pretty efficient at being put to sleep and woken up. Plus the code is immutable by default and you get a lot out of the box like logging and setting limits.

I wouldn't start a new project with CGI at all right now but I use Lambda constantly.


In addition to what the other replies said I'd like to offer the following observation:

Lambda is a control plane running inside an "AWS OS" context, which means it has access to internal APIs with scoped permissions. Most commonly people discuss user-facing lambdas on the edge; however, you are not obligated to expose them to the world.

If you do choose to go the cloud route, understand that your account activities generate quite a lot of data. The simplest example would be custom CloudWatch events generated from, say, autoscaling groups, i.e. "Server 23 has RAM usage above some moving-average threshold" => kick off lambda => custom logic (send email/Slack, reboot server, spin up more instances, whatever).

People who like middlebrow dismissals would say "what does it matter where the script runs, it could just as easily be running on the instance itself" - to them I say, pain is the best teacher. :)
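That autoscaling example might look something like this as a handler (the event fields and actions are hypothetical, just to show the shape of the pattern):

```python
def handler(event, context):
    # A custom CloudWatch/EventBridge event carries its payload in "detail".
    # All field names below are made up for illustration.
    detail = event.get("detail", {})
    over_threshold = (
        detail.get("metric") == "ram_usage"
        and detail.get("value", 0) > detail.get("threshold", float("inf"))
    )
    if over_threshold:
        # e.g. send a Slack message, reboot the server, add instances...
        action = "alert_and_scale"
    else:
        action = "ignore"
    return {"server": detail.get("server"), "action": action}
```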


We are using a container hosting .NET 6 with our lambda. We use it where I think lambdas really work well: processing queue items off of SQS. It works well with the dead-letter queue too. We don't notice any performance issues, but this is just a processor, so we don't need to worry about real-time responses either.


What was the thought behind processing SQS messages with a .NET 6 containerized lambda instead of a Node or Python lambda?


Have you used .NET recently? This is sort of a weird question. Not everyone uses node or python.


I found using AWS Copilot to deploy to AWS Fargate easy to deploy, maintain and scale.


Until it fails, and the backing CF errors and won't resolve itself for over 24 hrs. Bitten twice, never again with Copilot. Good idea they had, just a shaky foundation.


CF = Cloud Formation?


I went down the rabbithole of wanting to build my own lightweight Lambda, only to wonder if Lambda is just distributed CGI by another name.


I think CGI is a good high-level way to think about AWS Lambda and other serverless compute platforms. I think the key innovations are the integrations with cloud services and the scaling/economic model. Videos like the one linked below really demonstrate the level of optimization implemented to make this work at "cloud" scale. I think serverless compute platforms are going to become a really meaningful part of the software development ecosystem.

[1] https://www.youtube.com/watch?v=A-7j0QlGwFk&ab_channel=Insid...


Feels pretty similar to any typical fastcgi implementation with pools, scaling up/down, separate spaces for initialization code and per-request code, etc.


And, was it?


No offence but Cloud Run has been doing this for a while?

And now Cloud Functions gen 2 as well…?


I've found that using this will sometimes cause Lambda to return 500 errors while it's reloading the container image from the registry. This might be the price of allowing large images; they've decided not to do it in a blocking way.


How long does it take to fetch the container - is it warm or cold? For AWS Batch it was taking me 1-3 min. So I was really surprised/ happy to see this lambda container post.


It's warm - when you change the ImageUri or "Update Code" for the lambda definition, it downloads the container into "somewhere" lambda-y - this takes a few seconds depending on size. Startups are fairly quick, but because of the way it persists your running image in memory, your container is generally (on frequent usage) 'paused' between invocations, and resumes quite quickly.


Yeah, that's right. I talk about some of the details here: https://www.youtube.com/watch?v=A-7j0QlGwFk

We're working on some more formal documentation of this, both to help customers understand how to optimize things, and to join the research conversation about container loading. Our solution combines deterministic flattening, demand loading, erasure coding, convergent encryption, and a couple of other cool technologies to both accelerate the average case and drive down tail latency.


Thanks for the share, useful to have some of the details explained!


so cost-wise that's probably equivalent to running a server? but you get all of the nice lambda integrations like APIGW


Cost wise you are just paying for the time and memory when the lambda is running. So the image may be 'close' to the lambda but it's billed like other lambdas.


You can test it out here although it might be hard to get a cold timing for it if many are using it:

https://earthly-tools.com/text-mode


Is it possible to host an app like Django inside a container on Lambda? This could help Django/Postgres apps scale horizontally easily.


Not sure about Django but on the Node.js side there are Express.js compatible libraries that let you write your app like you would Express but it's exposed via Lambda and API gateway. Good chance Python has something similar. Biggest difference is you're not dealing with HTTP (directly) but it can be abstracted to -- from a framework perspective -- act like HTTP.


I am actually working on a Python library that works like you described, at https://cdevframework.io! The goal is to provide a developer experience like Django that then deploys onto serverless infrastructure. It is a long way from that goal, but one day it will get there!


Yes, but is the startup time of Django an issue? Besides that you'd have to return your data in the shape that the API gateway expects from lambdas.

Returning a 500 would look something like this:

    {
      "statusCode": 500,
      "headers": {
        "content-type": "text/plain; charset=utf-8"
      },
      "body": "Some error fetching the content"
    }
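In Python, a small helper that builds that proxy-integration shape might look like this (a sketch; the body text is just an example):

```python
def text_response(status_code, body):
    # Shape expected by API Gateway's Lambda proxy integration:
    # statusCode, a headers dict, and a string body.
    return {
        "statusCode": status_code,
        "headers": {"content-type": "text/plain; charset=utf-8"},
        "body": body,
    }

# e.g. in the handler's error path:
error = text_response(500, "Some error fetching the content")
```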


Yes, it's possible to wrap any asgi app to run in a lambda. Check out Mangum https://github.com/jordaneremieff/mangum


Yes it absolutely is possible. I have been considering doing this in combination with PG Aurora v2


Is Aurora v2 still inaccessible from outside the VPC? The fact that I have to set up some kind of proxy EC2 in order to view the database from my machine is a non-starter for me.


You can run any wsgi application in lambda with libraries like https://pypi.org/project/apig-wsgi/


Not Django, but AWS has a product focusing on this - Chalice https://github.com/aws/chalice


Pretty sure Zappa does this: https://github.com/zappa/Zappa


I have an entire video analytics pipeline running about a dozen containers for inference. Works great.


You and me both! Works great; I only wish that the 15-minute timeout could be increased…


I very carefully batch things / break them up to make it work well since none of the models I am using individually need 15 min for a single frame.


This plagues me almost daily. I would love for this limit to be increased as well ...


“Why run on lambda instead of fargate? Oops, we won’t tell you.” - AWS


This is going to throw unknowing readers for a loop, because it's a comment trying to be cheeky.

Simply put: Fargate/ECS/EC2+EB = long running tasks

Lambda = Short burst tasks with a max life of 15 minutes

Running a lambda 24/7 will nuke your credit card. Using Fargate for a scheduler/cron job that only runs 4 times a day will nuke your credit card. Use the right tool for the right job.
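As a rough back-of-the-envelope illustration (the prices below are placeholders resembling published us-east-1 list prices; they change and vary by region, so check current pricing):

```python
# Rough monthly cost: a 1 GB Lambda busy 24/7 vs. a small always-on
# Fargate task. All prices are illustrative placeholders.
LAMBDA_PER_GB_SECOND = 0.0000166667   # example Lambda compute price
FARGATE_PER_VCPU_HOUR = 0.04048       # example Fargate vCPU price
FARGATE_PER_GB_HOUR = 0.004445        # example Fargate memory price

HOURS = 30 * 24                       # 720 hours in a 30-day month
SECONDS = HOURS * 3600                # 2,592,000 seconds

# 1 GB Lambda running continuously (ignoring per-request charges):
lambda_cost = 1.0 * SECONDS * LAMBDA_PER_GB_SECOND            # ~$43.20

# 0.25 vCPU / 0.5 GB Fargate task running continuously:
fargate_cost = (0.25 * FARGATE_PER_VCPU_HOUR
                + 0.5 * FARGATE_PER_GB_HOUR) * HOURS          # ~$8.89
```

Flip the workload to four short cron runs a day and the comparison inverts: the Lambda bill rounds to pennies, while the always-on task keeps billing.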


AWS Data Pipeline is likely the best solution for a longer-running batch that runs 3-4 times a day.


It's definitely an option. Wrapped services are almost always more expensive, so careful attention to pricing details is needed.


The bulk of the cost here is usually on the EC2 side. The main extra cost is the time it takes your runner EC2 instance(s) to bootstrap - usually about 2-3 minutes. That can be mitigated by using prepaid discounts, etc.



