Modern Java in the Cloud, Part 2: Containerize It

So, after reading the first article in this series, you’ve got your application up and running in the MicroProfile-compliant runtime of your choice. Now what?

The industry standard nowadays is to put your app into a container and then run that container somewhere. The latter part will be the topic of the next article in our series, so let’s focus on the container part for now.

Several of the runtimes provide a Dockerfile when you download their starter, suited for what’s considered the typical use case. It assumes that you’ve already built your code, usually with Maven or Gradle, somewhere else, and that you now have a JAR, a WAR or similar ready to deploy.

If that fits your scenario, great. You can probably move on with the provided Dockerfile.
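For reference, such a provided Dockerfile usually boils down to little more than copying the built artifact into a runtime image. Here is a minimal sketch of that shape; the image tag and the artifact path are assumptions and will differ between runtimes and build tools:

```dockerfile
# Minimal sketch of a "bring your own JAR" Dockerfile: the application has
# already been built elsewhere, and we simply copy the artifact into a JRE
# image. Image tag and artifact name are assumptions.
FROM eclipse-temurin:21-jre

COPY target/my-app.jar /app/app.jar

EXPOSE 8080
ENTRYPOINT ["java", "-jar", "/app/app.jar"]
```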

Base Image

However, there are a few topics we should visit. First, what is the base image you’re running on? A rule of thumb is that you should aim for a tiny base image containing only exactly what you need to run your code. This is partly about speed, memory footprint and cost, but from my point of view mainly about security. Everything inside your running container can be exploited, whilst whatever is not in your container obviously cannot. That means you probably should not use a full Debian distro as your base image, and that you should avoid including, for instance, a terminal unless you really need it.
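To make the point concrete, switching to a smaller base image is often a one-line change. The sketch below uses one of Google’s distroless Java images, which ships essentially a JRE and nothing else (no shell, no package manager); the exact tag and the baked-in entrypoint behaviour are assumptions, so check the distroless documentation for your Java version:

```dockerfile
# Sketch: the same application on a much smaller, shell-less base image.
# The tag below is an assumption; pick the one matching your Java version.
FROM gcr.io/distroless/java21-debian12

COPY target/my-app.jar /app/app.jar

# The distroless Java images are documented as running "java -jar" as their
# entrypoint, so only the jar path is passed as the command.
CMD ["/app/app.jar"]
```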

Layers

We also need to talk about layers. A container image is built in layers: you start from a base image and add your own layers on top of it. These layers are cached at build time, so whenever you rebuild your image, Docker (or Podman, or whatever you’re using) will reuse the cached layers if they haven’t changed. You can take advantage of this when creating or tuning your Dockerfile, because some parts of your application change far more frequently than others. You probably change the versions of your libraries less often than you change your own code, for instance, so you’ll be better off downloading your libraries in a separate step that is executed before your source code is compiled. With Maven, you can do this by first copying your POM file into your container, running for instance mvn verify to download the dependencies, and only then copying your source code and running the rest of your build.
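A hedged sketch of that pattern in a Dockerfile build stage is shown below; the Maven image tag is an assumption, and mvn dependency:go-offline is a common alternative to mvn verify for warming the dependency cache:

```dockerfile
# Build stage that exploits layer caching: dependencies are resolved in their
# own layer, so they are only re-downloaded when the POM changes.
# The image tag is an assumption; match it to your JDK version.
FROM maven:3.9-eclipse-temurin-21 AS build
WORKDIR /build

# 1. Copy only the POM and resolve dependencies. This layer stays cached as
#    long as pom.xml is unchanged.
COPY pom.xml .
RUN mvn -B verify

# 2. Only now copy the sources and build. Editing code invalidates this layer,
#    but not the dependency layer above.
COPY src ./src
RUN mvn -B package
```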

Native?

If you’re creating something that should run in native mode, using GraalVM, there are other circumstances you’ll need to consider as well. You should build your code on the same architecture as the one you intend to run it on. There are several ways to do this, for instance by specifying which operating system the build should run on if you’re building with GitHub Actions, but what seems to me to be the safest way is to build your code inside a container as well, so that you have full control.

To do that, you’ll need to fine-tune your Dockerfile, and you definitely need a multi-stage build. An example of how to do it, hardened and tested in production for years, can be found on my GitHub at https://github.com/madsop/quarkus-multistage/blob/main/Dockerfile. Note also how everything is done in several steps to take advantage of the layer mechanism described above.
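Without reproducing the linked file, the overall shape of such a multi-stage native build is roughly the sketch below. The builder image, the use of the Maven wrapper, the -Pnative profile and the *-runner binary name are all assumptions (they follow common Quarkus conventions) and will depend on your framework and setup:

```dockerfile
# Stage 1: build the native executable inside a container, so the build
# architecture always matches the runtime architecture.
# Builder image, Maven wrapper and the -Pnative profile are assumptions.
FROM ghcr.io/graalvm/native-image-community:21 AS build
WORKDIR /build

# Resolve dependencies in their own layer (assumes the Maven wrapper is
# committed to the repository).
COPY pom.xml mvnw ./
COPY .mvn .mvn
RUN ./mvnw -B dependency:go-offline

# Compile to a native binary only after the sources change.
COPY src ./src
RUN ./mvnw -B package -Pnative -DskipTests

# Stage 2: a tiny runtime image that only carries the native binary.
FROM quay.io/quarkus/quarkus-micro-image:2.0
WORKDIR /work
COPY --from=build /build/target/*-runner /work/application
EXPOSE 8080
ENTRYPOINT ["./application"]
```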

Build the Stuff with Google Cloud Build

Once you’re happy with your container setup, you need something that turns your config into an actual image. There are lots of good options here, including Azure Pipelines, GitHub Actions, Jenkins and many others. I’ve used Google Cloud Build, as it plays well with Google Cloud Run and is also tunable if you need heavy build nodes, as we’ll see in the final parts of this series.

You can use Kaniko to cache the container layers we discussed above, which can save you both build time and money. This way, you only rebuild the layers that have actually changed since the last build.

Cloud Build lets you configure your build through your Dockerfile, through buildpacks, or through a vendor-specific cloudbuild.yaml file. I have no experience with buildpacks myself, so I won’t advise on them, but I prefer cloudbuild.yaml over using Dockerfiles directly. It’s more flexible and allows you to add config such as the above-mentioned Kaniko caching, or to state that you want your build to run on a faster machine. The syntax and ideas of the cloudbuild.yaml file are similar to a Jenkinsfile or the YAML files you use for GitHub Actions, as you can see in my example at https://github.com/madsop/quarkus-multistage/blob/main/cloudbuild.yaml.
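As a hedged illustration, a cloudbuild.yaml that combines Kaniko layer caching with a beefier build machine can look roughly like this; the destination image, cache TTL and machine type are placeholders:

```yaml
# Minimal cloudbuild.yaml sketch: build the image with Kaniko so unchanged
# layers come from the cache, and run the build on a larger machine type.
# Destination, TTL and machine type are placeholders.
steps:
  - name: 'gcr.io/kaniko-project/executor:latest'
    args:
      - --destination=gcr.io/$PROJECT_ID/my-app:$COMMIT_SHA
      - --cache=true
      - --cache-ttl=48h

options:
  machineType: 'E2_HIGHCPU_8'
```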

Run It with Serverless

The tech stack we’re assembling in this series can run a wide variety of applications, from the smallest Hello Worlds to huge, critical enterprise monoliths. I’d like to put some emphasis on the use case where you want a backend-for-frontend, as I’ve found this to be an area where Java and Kotlin have not been heavily present.

If you have a steady flow of users, you can keep your backend running 24/7 and have an always-running, always-available application. In that case, serverless probably isn’t the right tool. If, on the other hand, you only have a user every now and then, or a huge share of your requests arrive on Sunday nights, or only in the winter, serverless sounds like an opportunity you should consider.

Since one of the defining characteristics of serverless is that your application automatically scales up and down depending on the traffic, startup times are crucial. Thus, it might be worth choosing a runtime that starts up very fast. We’ll have a close look at one such runtime in our next article.

