A journey into running Corda 5 on an Apple Silicon Mac or other ARM processor-based hosts
Write Once, Run Anywhere
A couple of months back, a brand new shiny ARM-based Apple Silicon Mac dropped onto my desk here at R3. Figuratively, that is; first impressions were that it’s a little more “robust” than the previous generation of Intel Macs, so I suspect that had it been literally dropped, the desk would have suffered more than the laptop. But truth be told, I do tend to cart my laptops around a fair bit, and some robustness never hurt. I found it in no small part ironic that technology designed to rid ourselves of the shackles of cross-platform compatibility and inconsistent environment problems became the stumbling block in running Corda 5 on my new Mac hardware. But a little perseverance and some pull requests later, I am now happily using this Mac as my day-to-day development machine on the Corda 5 project.
I have been a Mac user for well over a decade, and I can reveal that I have a bit of a soft spot for ARM processors too. A good chunk of my career has been spent building all sorts of interesting ARM-based devices. ARM’s history as a company dates back to the vibrant home-computing scene of 1980s UK, when many of us realised we’d probably end up doing this for a living one day.
So the combination of the two was something that I was very keen to try out. And by the way, Amazon is not so far behind in thinking that there’s a future for ARM processors outside embedded devices either. Plus Corda 5 is pretty versatile; maybe one day we will see a use case to have it run on embedded devices. So more speed with less power: what’s not to like? Well for one, you have to rebuild all your native code. That shouldn’t be a problem for Corda as it’s JVM based, so write once, run anywhere. Well, it turns out that’s only sort of true.
Some time back, the distributed and web service oriented software world started releasing its software via a thing called Docker Containers. Docker promises “Run Anywhere” and it’s just about the first statement that you see when you visit their site. There is truth in that statement and there’s also no doubt that running software in containers has made immeasurable improvements in the ease of distribution, deployment, and orchestration of web services and other software components.
Docker containers, for those who are unaware of them, are based around Linux Kernel extensions. They create sandbox environments on a host Linux machine in which you can run services as if they are running on the same machine, in the same state every time you launch it anywhere. They also facilitate scalability quite nicely. Imagine you have a simple stateless service responding with some immutable data on request. Put that service in a container and a load balancer in front of a bunch of such containers and you now have multiple versions of the exact same service, each processing a share of the work in exactly the same way. Of course the architecture needs to become more complicated as the functionality provided by those services becomes less simple and data becomes mutable, but the principle remains similar.
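In Kubernetes terms, that scaling principle boils down to a replica count on a Deployment with a Service load-balancing across the copies. A minimal sketch follows; the names, image, and port are purely illustrative, not anything Corda-specific:

```yaml
# Three identical replicas of a stateless service, each handling a share of requests.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: echo
spec:
  replicas: 3                       # identical copies of the exact same service
  selector:
    matchLabels:
      app: echo
  template:
    metadata:
      labels:
        app: echo
    spec:
      containers:
        - name: echo
          image: hashicorp/http-echo   # example stateless service
          args: ["-text=hello"]
---
apiVersion: v1
kind: Service                       # load-balances requests across the replicas
metadata:
  name: echo
spec:
  selector:
    app: echo
  ports:
    - port: 80
      targetPort: 5678              # http-echo's default listening port
```

Bump `replicas` and you have more identical workers; that is the whole trick, until state and mutability enter the picture.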
To Docker’s advantage, back in the hedonistic days of 2020, if you were building software in between watching episodes of Tiger King, just about any mainstream computer that you were building it on would likely have an Intel architecture chipset. That meant with advances in virtualization, even the OS became a formality when it came to building containers for software deployment. Whilst typically containers used in production were deployed on cloud VMs running Linux, it was also no problem for development purposes to fire up Docker Desktop on a Mac and have Docker create virtualized Linux VMs seamlessly to run your containers in.
So how does this relate to Corda 5?
One thing the containerisation world probably didn’t consider a priority when it set out is the uptake of an alternative processor architecture for desktop computers.
My colleagues at R3 have already touched on the importance of high availability and horizontal scalability with regards to the Corda 5 architecture in other blog posts. It should be relatively clear by now that a product which utilises container orchestration to realise both of those goals is at the mercy of the processor architecture of those containers to an extent. If you tried to follow the Corda 5 quick start guide a few months ago you would have got so far, and then started running into problems trying to publish and launch some of the containers on an ARM device. The good news is many of those problems have been resolved in the Corda 5 build system already. The other good news is that those that haven’t are in the hands of our cluster management team and will be resolved shortly. The OK news is that, right now, you can already work around the remaining issues with just a few tweaks.
How does Docker handle platform architecture?
This leap into cross-platform architecture puts us back in a world where we suddenly have to care about how we deploy software again: less so because of the way we write and build it (even C++ is sort of portable; squint a bit and avoid the bits of Boost which are written in assembler), but because of the processor architecture on which we execute it. We are left with a situation where Docker Desktop is fully functional on Apple Silicon Macs, however it can only virtualize ARM Linux virtual machines, not Intel ones. That’s because virtualization is not emulation.
Before we delve into what does and doesn’t work in Corda 5, it would be helpful to learn a little about how Docker handles ARM support. The best source of information for this is unsurprisingly the Docker page on Apple Silicon support and their page on Docker manifests. In short, Docker manifests are files that act as an abstraction between the image that will be pulled down and the request made to Docker Hub. The manifest contains information about the various Docker images available on Docker Hub, and the image pulled down locally is determined by filtering on that information. When an ARM host attempts to pull an image via a manifest, should the manifest provide details of an ARM image, the Docker client will pull that ARM image instead of the Intel-based one. If no ARM image exists, Docker will pull the Intel one and run it under emulation, provided by QEMU under the hood.
Whilst this emulation will work up to a point, it is really a stop-gap solution to get something running; Docker advise against relying on it in your products, as they know a number of features will be unsupported. I will save you the trouble of trying this yourself and tell you that if you were to attempt to run the Corda 5 Docker images under QEMU, they would likely not start up. We will not dwell on why here; the goal should be to get Corda 5 running on ARM images.
What works out of the box and what doesn’t?
You are able to build Corda 5 on an ARM host in exactly the same way that you can build it on an Intel host. Build in this context means compilation of the Kotlin code so that it runs on a JVM. Corda 5 requires that you build it with a Java 11 JDK. The quick start guide explains that our chosen JDK is the one from Azul. Azul provide ARM-compiled JDKs for both Linux and Apple Silicon Macs from their download page. So all is good in this regard.
The next good news is that running Corda 5 as a single process for development (using an in-process message bus rather than Kafka) works in exactly the same way on an ARM host as it does on an Intel host. We call running a single process for development the “combined worker.” I won’t dwell on using the combined worker here, but you can find it in the IntelliJ run configurations, which are present when you open the Corda 5 project in IntelliJ.
Following some work on the build system and some alignment with our friends at Azul, you can also now publish Corda 5, i.e. run the publishOSGiImage task, on an Apple Silicon Mac just as described in the local development guide. The reason that you can do this is that Azul have built and published their own ARM Docker images to Docker Hub, based on their ARM JDK v11, at the request of R3! When you publish Corda 5 Docker images using that Gradle task, you are basing those Corda images on a base image which contains a working JDK provided by Azul. Had Azul not provided ARM versions of the JDK via a Docker manifest, Docker Desktop would have pulled down Intel images and run them under emulation. In fact, this is what happened before attention was given to ARM support for Corda 5, and hence where this journey started.
Now we get to the part where the not so good news starts. What you cannot yet do, is follow that local development guide to get the Corda “pre-requisites” required to run a Corda 5 Kubernetes cluster.
ARM Yourself! (aka just tell me how to get it working)
This blog should be read in conjunction with the “Install Corda pre-requisites” section from the local development guide. In addition to that particular section, please generally follow all instructions given in the local development guide.
Some basic Kubernetes and Helm knowledge is assumed here. For a start, you should at least know what those things are and also have some knowledge on how to interact with them from the command line. That is true of Corda 5 development generally.
There is an assumption in these instructions that you already have a running Kubernetes cluster on your Mac. These steps were tested against the Kubernetes cluster that comes with Docker Desktop, so it is recommended that you do the same.
Firstly, we cannot use the Kafka part of the pre-requisites Helm chart provided in the corda-dev-helm repository. The Bitnami chart has no ARM support, and although this is a frequently requested feature, there is no movement on it right now.
Corda 5 puts no special requirements on a Kafka deployment, so pointing it at an off-the-shelf deployment which runs on an ARM-based Mac is sufficient. When Corda 5 starts up, it creates all the topics that it needs for operation as part of an InitContainer. By default, Corda will try to connect to a Kafka bootstrap server at prereqs-kafka on port 9092, but even this is configurable by applying the correct value to the Corda Helm chart. The following declaration, saved as kafka.yaml, stands up such a deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prereqs-kafka
  labels:
    app: prereqs-kafka
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prereqs-kafka
  template:
    metadata:
      labels:
        app: prereqs-kafka
    spec:
      containers:
        - name: kafka
          # Canonical's ARM-capable Kafka image; check Docker Hub for current tags
          image: ubuntu/kafka:latest
          env:
            - name: ZOOKEEPER_HOST
              value: "localhost"   # ZooKeeper runs in the same pod
          args: ["/etc/kafka/server.properties", "--override", "advertised.listeners=PLAINTEXT://prereqs-kafka:9092"]
          ports:
            - containerPort: 9092
        - name: zookeeper
          image: ubuntu/zookeeper:latest
          ports:
            - containerPort: 2181
---
apiVersion: v1
kind: Service
metadata:
  name: prereqs-kafka
spec:
  selector:
    app: prereqs-kafka
  ports:
    - name: kafka
      port: 9092
    - name: zookeeper
      port: 2181
If you don’t know what this does and want to know, the Kubernetes documentation is the correct place to start. Kafka requires an external service called ZooKeeper to be running, which is why you see references to it in this declaration. ZooKeeper is a generic service for administering metadata about distributed systems; if you want to know how Kafka uses it, the Kafka documentation will explain all. A few minutes of reading around these topics should enable you to understand what this YAML file is telling Kubernetes to do. It is fairly standard stuff.
To apply this to your Kubernetes cluster, now type:
kubectl apply -f kafka.yaml -n corda
Again, reading the Kubernetes documentation will explain what is happening here. After successful completion, you now have Kafka on your cluster.
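A quick sanity check never hurts (the pod and service names will match whatever your kafka.yaml declared):

```shell
# The Kafka and ZooKeeper containers should reach Running state...
kubectl get pods -n corda

# ...and a service should be exposing the broker on port 9092.
kubectl get svc -n corda
```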
Similarly to Kafka, Corda 5 does not impose any special requirements on a Postgres deployment. Here you are shown how to use the official Postgres image, although we still utilise the Bitnami chart, which rather handily configures everything for us too.
Just like Kafka, there is no Bitnami ARM image. However, the Bitnami Helm chart allows you to replace their own Postgres image with an official Postgres image if you so choose, and official ARM Postgres images are already published. For this reason you should follow the local development guide where it describes cloning the R3 corda-dev-helm repository and where it explains how to add the Bitnami repository to your Helm installation.
The bit that you need to ignore is the part where it tells you to type helm install prereqs -n corda charts/corda-dev. Instead, you are going to run that command with some extra options to tell the Helm chart to ignore Kafka completely (we installed this already) and to use the official Postgres image instead of the Bitnami one. The official image requires some extra configuration too, so all this needs to be passed to Helm when executing the install command.
What this means in practice, is that you need to execute the following from the command line:
helm install prereqs -n corda charts/corda-dev \
--set kafka.enabled=false \
--set postgresql.image.repository=postgres,postgresql.image.tag=10.6 \
--set postgresql.postgresqlDataDir=/var/lib/postgresql/data/pgdata \
--set postgresql.persistence.mountPath=/var/lib/postgresql/data \
--set postgresql.volumePermissions.image.repository=alpine \
--set postgresql.volumePermissions.image.tag="3.10" \
--set postgresql.primary.initdb.scripts=null \
--timeout 10m
Deploying Corda 5
Whilst Corda 5 is built and published in the same way on an ARM host as on an Intel host, we must tweak its installation into the cluster a little. This is only because we’ve played around with the pre-requisites above and we need to specifically tell the Corda Helm chart that our Canonical Kafka does not support TLS or SASL. Where the guide tells you to execute helm install corda -n corda charts/corda --values values.yaml --wait from the root of the corda-runtime-os repository on your local machine, instead you must execute it with two extra parameters:
helm install corda -n corda charts/corda \
--values values.yaml \
--set kafka.tls.enabled=false \
--set kafka.sasl.enabled=false \
--wait
So now you have Kafka, Postgres and Corda running in your cluster. You’ll probably want to jump back into the standard Corda 5 developer guide now and get going on building and installing a CorDapp into your cluster. The process for doing this is the same on an ARM host as an Intel host.
A number of developers working on Corda 5 here at R3 are already using Apple Silicon Macs as their day-to-day development machines, following these guidelines alongside the usual Corda 5 ones. Don’t be deterred by the extra steps: as this blog demonstrates, thanks to the work already done on Corda 5, only a few relatively minor tweaks are required to get you up and running.
ARMed and Dangerous
So hopefully you have a running Corda 5 cluster on your Mac now and because all ARM puns are likely now used, this is a good place to wrap up and explain our status relating to ARM support at R3.
- The Corda 5 Cluster Management team are busy pondering how to align Intel and ARM host images for the pre-requisites, so one day soon the instructions outlined in this blog will become moot. Users developing against Corda 5 on a Mac will have the exact same experience as on any Intel-based device.
- The Corda Infrastructure team now build Corda 5 on an ARM instance on AWS as a nightly build, so we now officially support ARM at the build stage. Any ARM-specific problems not picked up during local development will additionally be flagged to R3 developers if this nightly build fails.
The rest is open to our customers. If you have an ARM-based project that you feel might benefit from using Corda 5, I would encourage you to give it a go. Our product managers would love to hear from anyone who’d like official support for Corda on ARM. For instance, at CordaCon 2022, two collaborating third-party organisations gave a talk on using Corda to enable pay-per-use transactions on embedded devices in the manufacturing sector. Perhaps your project could run Corda directly on such devices, reducing the need for external Intel-based hardware. Perhaps you have other ideas for it that we’ve not yet thought of. Maybe you just like playing with dev boards and think this sounds cool. Corda 5 is open source, so if you can play with it and invent new ways of using it for yourselves, please do so. Then be sure to let us know!