Armv9-A: How our Kinibi 600 Trusted OS utilises MTE and FF-A features to create state-of-the-art TEEs

5 June 2023

In February of this year, we published a blog post providing an overview of Armv9-A – the latest iteration of Arm’s processor architecture for a broad range of devices, including smartphones, laptops, vehicles, and Internet of Things [IoT] devices.

Launched in March 2021, Armv9-A represents Arm’s vision for what it hopes will be the computing platform for the next 300 billion chips across the next decade. The post discussed some of the key extensions available in Armv9-A, such as Privileged Access Never [PAN], Pointer Authentication Codes [PAC] and Branch Target Identification [BTI].

From a Trustonic perspective, however, the two most important features of Armv9-A are its Memory Tagging Extension [MTE] and its Secure Hypervisor, which led to the Firmware Framework for Arm A-profile [FF-A]. Both features are key elements of our Kinibi 600 Operating System [OS], which, on suitable hardware, provides a Trusted Execution Environment [TEE].

Given the complexities of MTE and FF-A, we will be taking a deeper dive into these features in this blog post, explaining why we decided to implement them in Kinibi 600.

What is MTE?

Introduced by Arm in the Armv8.5-A architecture update, MTE is a new hardware mechanism that enables users to tag regions of the virtual memory.

When a memory region is tagged, it can be accessed only via virtual addresses that contain the same tag. If the region is accessed with a different tag, the central processing unit [CPU] prevents access and raises an abort.

MTE has a considerable hardware cost, and thus uses reasonable limits: tags are 4 bits wide, so an operating system can use 16 different MTE tags per application, and the minimum size of a tagged memory region is 16 bytes.

MTE is already available to Linux applications, which can call mmap() with PROT_MTE to allocate a memory area that supports tagging. It is also supported by the clang toolchain and can be used to protect stacks with the ‘-fsanitize=memtag’ option.

How Trustonic has used MTE in Kinibi 600

With Kinibi, we wanted to offer the new MTE feature to the Trusted Applications [TAs] within the TEE itself. The idea was to make the adoption of MTE as painless as possible for the TAs.

When the hardware supports MTE, it is automatically enabled for the dynamic memory management of the TAs. This means that the buffers allocated with TEE_Malloc() are automatically tagged, and each call to TEE_Malloc() returns a buffer with a different tag.

Therefore, if the application attempts to access the buffer beyond the allocated size, the CPU will detect it and the TA will be stopped immediately. MTE will also detect when a buffer is accessed after being freed, or when it is freed twice.

By detecting all these common mistakes, MTE demonstrates its power, and TAs will benefit from it without changing one line of code. The structure of the heap – or the metadata – is always protected by a special tag.

As a result, the application cannot access the metadata by mistake.
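The behaviour described above can be approximated with a small software model. This is only a sketch under stated assumptions, not the Kinibi allocator: the names toy_malloc(), toy_free() and toy_access_ok() are hypothetical, tags are handed out in a simple cycle, and tag 0 is assumed to be the value reserved for heap metadata and freed memory.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define GRANULE      16u  /* MTE tags memory in 16-byte granules */
#define NUM_GRANULES 64u  /* size of the toy heap, in granules */
#define TAG_META     0x0u /* assumption: tag reserved for metadata and free memory */

/* One 4-bit tag per granule, standing in for the hardware tag storage. */
static uint8_t granule_tag[NUM_GRANULES];
static size_t  next_granule = 1; /* granule 0 models the heap header */
static uint8_t next_tag = 1;     /* tags 1..15 are handed to allocations */

/* A pointer is modelled as an offset into the heap plus the tag it carries. */
typedef struct { size_t offset; uint8_t tag; } tagged_ptr;

/* Round the size up to whole granules, colour them with a fresh tag, and
 * leave one metadata granule (still TAG_META) after the allocation. */
tagged_ptr toy_malloc(size_t size)
{
    size_t n = (size + GRANULE - 1) / GRANULE;
    assert(n > 0 && next_granule + n + 1 <= NUM_GRANULES);
    tagged_ptr p = { next_granule * GRANULE, next_tag };
    for (size_t i = 0; i < n; i++)
        granule_tag[next_granule + i] = next_tag;
    next_granule += n + 1;
    next_tag = (uint8_t)(next_tag % 15u + 1u); /* cycle through 1..15 */
    return p;
}

/* Tag check: every granule touched must carry the pointer's tag. */
bool toy_access_ok(tagged_ptr p, size_t len)
{
    if (len == 0)
        return true;
    size_t first = p.offset / GRANULE;
    size_t last  = (p.offset + len - 1) / GRANULE;
    for (size_t g = first; g <= last; g++)
        if (g >= NUM_GRANULES || granule_tag[g] != p.tag)
            return false;
    return true;
}

/* Free retags the buffer with TAG_META. A second free of the same pointer
 * fails the tag comparison and is reported by returning false. */
bool toy_free(tagged_ptr p)
{
    size_t g = p.offset / GRANULE;
    if (g >= NUM_GRANULES || granule_tag[g] != p.tag)
        return false;
    while (g < NUM_GRANULES && granule_tag[g] == p.tag)
        granule_tag[g++] = TAG_META;
    return true;
}
```

In this model a 1-byte allocation still owns a full 16-byte granule, so toy_access_ok() accepts up to 16 bytes and rejects the 17th. After toy_free(), both a further access and a second toy_free() through the stale pointer are rejected, because the stale tag no longer matches the memory tag.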

This process is illustrated by the graph below:

[Figure: MTE tags applied to the TA heap buffers and to the heap metadata]

Enabling MTE for the dynamic allocations of Kinibi 600 revealed several issues in the TEE, which we fixed.

During the project, we had to rewrite a large part of our dynamic allocator [libheap] to make it compatible with MTE. We took the opportunity to increase the code coverage of our unit tests and reached a line coverage of 99.6% for the library.

When we started the project, we didn’t have any hardware with MTE available, and we therefore decided to use Quick Emulator [QEMU]. QEMU can be started with MTE enabled through the ‘mte=on’ machine option.

MTE can be independently enabled and used both in the Normal World – Linux and user space applications – and in the Secure World – the TEE and TAs. It is particularly convenient to be able to use QEMU with new hardware features when the real hardware is not yet available.

The following code snippets show the effect of using MTE in QEMU for dynamic allocations:

1. Allocation of two buffers

TEE_Result TA_EXPORT TA_CreateEntryPoint(void)
{
    /* Two 1-byte allocations: each one receives its own MTE tag. */
    void* ptr1 = TEE_Malloc(1, TEE_MALLOC_FILL_ZERO);
    assert(ptr1 != NULL);
    TEE_DbgPrintLnf("%s ptr1=%p", __func__, ptr1);
    void* ptr2 = TEE_Malloc(1, TEE_MALLOC_FILL_ZERO);
    assert(ptr2 != NULL);
    TEE_DbgPrintLnf("%s ptr2=%p", __func__, ptr2);
    return TEE_SUCCESS;
}

The console output in QEMU:

[ 3566.647101] Trustonic TEE: 601(0)|TA_CreateEntryPoint ptr1=0x0d0000000060b030
[ 3566.648000] Trustonic TEE: 601(0)|TA_CreateEntryPoint ptr2=0x090000000060b050

The first buffer is tagged with 0xd while the second buffer is tagged with 0x9.
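The tag can be read directly from the printed pointer, because MTE stores the 4-bit tag in bits 59 to 56 of a 64-bit virtual address. A minimal sketch, in which the helper names mte_tag_of() and mte_untagged() are hypothetical:

```c
#include <stdint.h>

/* The 4-bit MTE tag sits in bits [59:56] of the virtual address. */
static inline unsigned mte_tag_of(uint64_t addr)
{
    return (unsigned)((addr >> 56) & 0xFu);
}

/* The same address with the tag bits cleared. */
static inline uint64_t mte_untagged(uint64_t addr)
{
    return addr & ~(UINT64_C(0xF) << 56);
}
```

Applied to the two pointers logged above, mte_tag_of() yields 0xd for ptr1 and 0x9 for ptr2, while mte_untagged() gives 0x60b030 and 0x60b050.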

2. Buffer overflow

TEE_Result TA_EXPORT TA_CreateEntryPoint(void)
{
    void* ptr1 = TEE_Malloc(1, TEE_MALLOC_FILL_ZERO);
    assert(ptr1 != NULL);
    TEE_DbgPrintLnf("%s ptr1=%p", __func__, ptr1);
    /* Write 17 bytes into a 1-byte buffer: the write spills into the next 16-byte granule. */
    memset(ptr1, 0, 1 + 16);
    return TEE_SUCCESS;
}

The console output in QEMU:

[ 3905.395177] Trustonic TEE: 601(0)|TA_CreateEntryPoint ptr1=0x0e000000000bf030
[ 3905.400372] Trustonic TEE: 103(0)|FAULT in 601, thread 0x10006, UUID=78ecdb21-7316-51a2-a74f-ee5bf3749fd1
[ 3905.401420] Trustonic TEE: 103(0)|EXCH: trap_type=0x2 (TRAP_SEGMENTATION), trap_data=0xbf040
[ 3905.402206] Trustonic TEE: mtk(0)|MTK: EXCEPTION in thread 0x00010006, cpsr=0x60000000 [EL0/SP0,CZ]
[ 3905.405380] Trustonic TEE: mtk(0)|MTK: ESR=0x92000051 [DATA_ABORT_LOWER_EL, 32-bit instruction trapped, ISS=0x051, Recoverable error, write, Synchronous Tag Check]
[ 3905.408284] Trustonic TEE: mtk(0)|MTK: FAR=0x0E000000000BF040

We can see that the Trusted Application is killed by the TEE because it writes beyond the allocated buffer: the pointer still carries the buffer’s tag, which does not match the tag of the memory that follows.
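The faulting address in the log can be predicted from the 16-byte tag granule. The sketch below assumes a write that proceeds byte by byte from low to high addresses and a granule-aligned buffer; the helper names tagged_span() and first_fault_address() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MTE_GRANULE 16u

/* Number of bytes covered by the tag of an allocation of `size` bytes:
 * the size rounded up to a whole number of granules. */
static size_t tagged_span(size_t size)
{
    return (size + MTE_GRANULE - 1) / MTE_GRANULE * MTE_GRANULE;
}

/* First address at which a linear write of `len` bytes leaves the tagged
 * span, or 0 if the write stays inside it. */
static uint64_t first_fault_address(uint64_t ptr, size_t alloc_size, size_t len)
{
    assert(ptr % MTE_GRANULE == 0); /* assumption: granule-aligned buffer */
    size_t span = tagged_span(alloc_size);
    return len > span ? ptr + span : 0;
}
```

For the 1-byte buffer at 0x0e000000000bf030 and a 17-byte write, this gives 0x0e000000000bf040, the FAR value reported above. It also shows a limit of the mechanism: a write of up to 16 bytes stays inside the granule and is not caught by the tag check, which is why the example writes 1 + 16 bytes.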

3. Use after free

TEE_Result TA_EXPORT TA_CreateEntryPoint(void)
{
    void* ptr1 = TEE_Malloc(16, TEE_MALLOC_FILL_ZERO);
    assert(ptr1 != NULL);
    TEE_DbgPrintLnf("%s ptr1=%p", __func__, ptr1);
    TEE_Free(ptr1);
    /* ptr1 still carries the old tag, but the freed memory no longer does. */
    memset(ptr1, 0, 16);
    return TEE_SUCCESS;
}

The console output in QEMU:

[ 6340.879143] Trustonic TEE: 601(0)|TA_CreateEntryPoint ptr1=0x0800000000511030
[ 6340.880156] Trustonic TEE: 103(0)|FAULT in 601, thread 0x10006, UUID=78ecdb21-7316-51a2-a74f-ee5bf3749fd1
[ 6340.881456] Trustonic TEE: 103(0)|EXCH: trap_type=0x2 (TRAP_SEGMENTATION), trap_data=0x511030
[ 6340.885543] Trustonic TEE: mtk(0)|MTK: EXCEPTION in thread 0x00010006, cpsr=0x20000000 [EL0/SP0,C]
[ 6340.888074] Trustonic TEE: mtk(0)|MTK: ESR=0x92000051 [DATA_ABORT_LOWER_EL, 32-bit instruction trapped, ISS=0x051, Recoverable error, write, Synchronous Tag Check]
[ 6340.890146] Trustonic TEE: mtk(0)|MTK: FAR=0x0800000000511030

Here, we can see that the TA is killed by the TEE because it is trying to access a buffer which has been previously freed.
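Both fault logs report ESR=0x92000051. The fields that the log spells out can be recovered with a few shifts and masks, following the AArch64 ESR layout for data aborts. This is a minimal sketch; the struct and function names are hypothetical and only the fields mentioned in the log are decoded.

```c
#include <stdbool.h>
#include <stdint.h>

#define EC_DATA_ABORT_LOWER_EL 0x24u /* data abort taken from a lower exception level */
#define DFSC_SYNC_TAG_CHECK    0x11u /* synchronous tag check fault */

typedef struct {
    unsigned ec;   /* exception class, ESR bits [31:26] */
    bool     il;   /* ESR bit 25: true for a 32-bit instruction */
    bool     wnr;  /* ISS bit 6: true for a write, false for a read */
    unsigned dfsc; /* data fault status code, ISS bits [5:0] */
} esr_fields;

static esr_fields esr_decode(uint32_t esr)
{
    esr_fields f;
    f.ec   = (esr >> 26) & 0x3Fu;
    f.il   = ((esr >> 25) & 1u) != 0;
    f.wnr  = ((esr >> 6) & 1u) != 0;
    f.dfsc = esr & 0x3Fu;
    return f;
}

/* True when the syndrome describes a synchronous MTE tag check fault
 * raised by code running at a lower exception level. */
static bool is_mte_tag_fault(uint32_t esr)
{
    esr_fields f = esr_decode(esr);
    return f.ec == EC_DATA_ABORT_LOWER_EL && f.dfsc == DFSC_SYNC_TAG_CHECK;
}
```

For 0x92000051 this gives an exception class of 0x24, a 32-bit instruction, a write access and a fault status code of 0x11, which matches the DATA_ABORT_LOWER_EL, write and Synchronous Tag Check fields printed by the TEE.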

MTE provides a good level of protection against attacks, while simultaneously delivering an effective way to make code more robust. What’s more, TAs can also use MTE to protect their stacks. To do so, the TAs must be recompiled with the appropriate option. In summary, we strongly recommend that developers use MTE.

Not only is the feature highly scalable, but we did not experience any performance issues throughout our project.

What is FF-A?

In 2017, Arm announced Armv8.4, the most prominent feature of which was the Secure Hypervisor in SEL2. We have since been in regular contact with Arm to discuss the implications of this for the TEE and, in 2020, we set out our vision for how Hypervisors in the TEE would be utilised.

Alongside the change in hardware, Arm announced an open-source Hypervisor called Hafnium, which was designed to use a standardised communication protocol that Arm put out for open discussion.

Eventually, this protocol came to be known as FF-A, and support for it has subsequently been added to the Linux kernel, to Trusted Firmware-A [TF-A], which runs in EL3, and to Kinibi. FF-A is designed to work at any level of the architecture, even if some components are not present, as in legacy systems.

To achieve this, Arm contacted major players in the ecosystem – including Trustonic – and discussed the requirements over the course of several months. Our meetings with Arm began in 2018 and continue to be held once every two months.

How FF-A works in Kinibi 600

As part of our forward-looking development projects, we started an FF-A prototype in 2020, based on the initial work by the Arm architecture team on Fixed Virtual Platforms [FVP].

In 2021, we ported our prototype to the HiKey960 board, and reworked the memory management to support the Hafnium Hypervisor on Arm’s Total Compute [TC0] platform.

At the end of that year, we shared the first FF-A versions with two of our silicon customers.

Then, in 2022, we enabled our customers to validate the most common TEE use cases, such as Keymaster, Trusted User Interface [TUI] – Pay-by-Phone – and Widevine’s digital rights management [DRM] system.

To achieve this, we integrated changes for FF-A v1.1 and added the application programming interfaces [APIs] required for DRM and TUI, such as drApiMapFfaBuffer(). Now, in 2023, we have prototyped FF-A on QEMU and contributed to the TF-A open-source project.
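As a side note on version numbers such as v1.1: the FF-A specification exchanges versions as a single 32-bit word, with bit 31 zero, the major version in bits 30 to 16 and the minor version in bits 15 to 0. The sketch below only illustrates that encoding; the helper names are hypothetical and are not part of Kinibi or of any FF-A driver.

```c
#include <assert.h>
#include <stdint.h>

/* Pack an FF-A version: bit 31 is zero, bits [30:16] hold the major
 * version and bits [15:0] hold the minor version. */
static uint32_t ffa_version_pack(unsigned major, unsigned minor)
{
    assert(major <= 0x7FFFu && minor <= 0xFFFFu);
    return ((uint32_t)major << 16) | (uint32_t)minor;
}

/* Extract the major version from a packed version word. */
static unsigned ffa_version_major(uint32_t version)
{
    return (version >> 16) & 0x7FFFu;
}

/* Extract the minor version from a packed version word. */
static unsigned ffa_version_minor(uint32_t version)
{
    return version & 0xFFFFu;
}
```

With this encoding, FF-A v1.1 is the word 0x00010001 and FF-A v1.0 is 0x00010000.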

Customers that are considering using FF-A need to consider several important aspects:

The OS: FF-A must be supported in the Linux kernel, or in whichever OS is chosen. Trustonic can provide support to help integrate FF-A into the OS.
The Hypervisor: it must also support FF-A.
The firmware: the latest TF-A should be used to get FF-A support in EL3.
The Secure World: customers can use Hafnium or their own Hypervisor. Kinibi can be used in various configurations, with or without FF-A.

Today, FF-A is pushed mostly by Google for Android-based devices. However, we also have other partners that are keen to implement support for FF-A.

It is worth noting that FF-A is universal and can be used on both Armv8-A and Armv9-A platforms, with or without a Hypervisor in the Secure World. This allows for a smooth upgrade path: the existing Board Support Package [BSP] can first be moved to FF-A without the Hypervisor, which can then be added in a second step.

In FF-A terminology, virtual machines in the Secure World are called secure partitions, and the Hypervisor in the Secure World is called the secure partition manager [SPM].

On the Armv9-A architecture, the secure partition manager is divided into two: the SPM core [SPMC] running in SEL2, and the SPM dispatcher [SPMD] running in EL3.

On the Armv8 architecture, the SPMC and the SPMD both run in EL3. Kinibi now also supports the Device Tree format with FF-A, as required by the TF-A implementation.

Some of the biggest technology companies are looking into FF-A, and making its support a requirement for their upcoming products.

Trustonic is ideally placed to help original equipment manufacturers [OEMs] and silicon providers to achieve this with Kinibi 600, which comes with FF-A support as standard. Kinibi 600 is available for evaluation by silicon providers, and also by OEMs via integrations with some of our existing silicon partner implementations.

Summary

Armv9-A is the next step in Arm’s processor evolution. As a security company, we are particularly excited by MTE. While its primary role is the early detection of errors within programs, in practice this adds an extra layer of security, as these errors become much harder for attackers to exploit.

MTE makes the attacker’s life harder, and therefore makes our customers’ systems more secure.
