Saturday, August 1, 2015

EFI Byte Code

This short post will provide some history around the EFI Byte Code (EBC). There were some interesting questions around this at
BLR would like to thank for braving the heat and teaching us about the UEFI security model today. Thank you Vincent!
As such, I thought that I'd take both a trip down memory lane and provide some forward-looking thoughts, too.

The quick background is that EBC is intended to allow for writing a single binary UEFI driver that will work on a broad class of system boards supporting different native instruction set architectures. ARM interest in this capability was recently discussed at a UEFI event, for example.

To begin, EBC is a software virtual machine described in the UEFI 2.5 specification, with the Instruction Set Architecture (ISA) in Appendix J and the virtual machine architecture defined in chapter 21. The idea is that UEFI drivers written in C can be compiled into the EBC ISA and linked into a PE/COFF image with a subsystem type of IMAGE_SUBSYSTEM_EFI_BOOT_SERVICE_DRIVER and a machine type of IMAGE_FILE_MACHINE_EBC. This is distinct from native-mode UEFI images, which typically carry IMAGE_FILE_MACHINE_AMD64/I386/IA64/ARMNT/ARM64 for the x64, IA32, Itanium, 32-bit ARM, and AArch64 native CPU bindings in the UEFI specification today (in chapter 2). The Microsoft linker will produce these image types, but today only the Intel EBC compiler will generate the EBC code itself.
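To make the image-type distinction concrete, here is a minimal sketch of how a loader might recognize an EBC driver from its PE/COFF headers. The function name is hypothetical and the check is deliberately shallow; the numeric constants and header offsets are the ones I recall from the PE/COFF specification, so treat them as assumptions to verify rather than as the EDK II loader's actual logic.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Values as defined in the PE/COFF specification (assumed, verify before use). */
#define IMAGE_FILE_MACHINE_EBC                  0x0EBC
#define IMAGE_SUBSYSTEM_EFI_BOOT_SERVICE_DRIVER 11

/* Read little-endian fields without assuming alignment or host endianness. */
static uint16_t read_le16(const uint8_t *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

static uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: returns 1 if the buffer looks like an EBC
 * boot-service driver image, 0 otherwise. */
int is_ebc_boot_service_driver(const uint8_t *image, size_t size)
{
    uint32_t pe;

    /* DOS header: "MZ" magic, with e_lfanew at offset 0x3C. */
    if (size < 0x40 || image[0] != 'M' || image[1] != 'Z') {
        return 0;
    }
    pe = read_le32(image + 0x3C);

    /* Need the 4-byte signature, the 20-byte COFF header, and the
     * optional header up to and including Subsystem (offset 68, 2 bytes). */
    if (pe > size || size - pe < 24 + 70) {
        return 0;
    }
    if (memcmp(image + pe, "PE\0\0", 4) != 0) {
        return 0;
    }

    /* Machine is the first COFF header field; Subsystem sits at the same
     * optional-header offset in both PE32 and PE32+ images. */
    return read_le16(image + pe + 4) == IMAGE_FILE_MACHINE_EBC &&
           read_le16(image + pe + 24 + 68) == IMAGE_SUBSYSTEM_EFI_BOOT_SERVICE_DRIVER;
}
```

A real loader does far more (section validation, relocation, authentication); the point is only that one two-byte field is what routes an image to the interpreter instead of the native CPU.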

The above description defines what the producer of the code needs to do, namely compile and link the driver C sources using the above tools. On the consumer side, the idea is that the EBC-formatted image will be located in the non-volatile storage (option ROM container) of a host-bus adapter card, such as a Peripheral Component Interconnect (PCI) card. The PCI specification designates types of option ROMs, such as UEFI, PC/AT, or FCode Open Firmware. The underlying UEFI firmware on the system board with the PCI slots can optionally contain the EBC interpreter, such as the DXE driver variant. The interpreter will be invoked after the image loader discovers, potentially authenticates via UEFI Secure Boot, and relocates the image into memory. One interesting thing that the interpreter has to do is create 'thunks' at the boundary where EBC code and native code call one another:

  // Create a thunk for EBC code. R7 points to a 32-bit (in a 64-bit slot)
  // "offset from self" pointer to the EBC entry point.
  // After we're done, *(UINT64 *)R7 will be the address of the new thunk.
  case 5:
    Offset            = (INT32) VmReadMem32 (VmPtr, (UINTN) VmPtr->Gpr[7]);
    U64EbcEntryPoint  = (UINT64) (VmPtr->Gpr[7] + Offset + 4);
    EbcEntryPoint     = (VOID *) (UINTN) U64EbcEntryPoint;

    // Now create a new thunk
    Status = EbcCreateThunks (VmPtr->ImageHandle, EbcEntryPoint, &Thunk, 0);
    if (EFI_ERROR (Status)) {
      return Status;
    }

    // Finally replace the EBC entry point memory with the thunk address
    VmWriteMem64 (VmPtr, (UINTN) VmPtr->Gpr[7], (UINT64) (UINTN) Thunk);
    break;
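The "offset from self" arithmetic in the fragment above can be tried in isolation. This is a minimal sketch, not the interpreter's code: the function name and the flat byte array standing in for VM memory are my own assumptions, and only the `slot + offset + 4` computation mirrors the excerpt.

```c
#include <stdint.h>

/* Hypothetical stand-in for the R7-based lookup: `mem` models VM memory and
 * `slot` models the address held in R7. The slot holds a signed 32-bit
 * little-endian offset measured from the end of that 32-bit field. */
uint64_t resolve_self_relative(const uint8_t *mem, uint64_t slot)
{
    uint32_t raw = (uint32_t)mem[slot] |
                   ((uint32_t)mem[slot + 1] << 8) |
                   ((uint32_t)mem[slot + 2] << 16) |
                   ((uint32_t)mem[slot + 3] << 24);
    int32_t offset = (int32_t)raw;   /* sign matters: targets may precede the slot */

    /* Same shape as: U64EbcEntryPoint = Gpr[7] + Offset + 4 */
    return slot + (uint64_t)(int64_t)offset + 4;
}
```

With an offset of 0x20 stored at address 16, the entry point resolves to 16 + 32 + 4 = 52; with an offset of -8 it resolves to 12, which is why the cast to a signed type is not optional.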

Note:  Please no flames on the coding fragment. The CamelCase adheres to the coding standard more reminiscent of Windows Kernel Drivers that is distinct from the Unix-like Indian Hill found in Linux and coreboot. To me the particular coding style is less important than both having and enforcing one for a given project.

The EBC ISA is a very simple load/store architecture with a strongly ordered memory model. It is not intended for high performance so much as for lending itself to a small, simple interpreter, in order to minimize code space in the system board flash. It features a relatively concise encoding, which ends up being slightly larger than an IA32 CISC encoding and smaller than an Itanium VLIW-style encoding. The ISA also does not carry type information the way the Java Virtual Machine Language (JVML) does, since the EBC ISA is intended to be generated from the weakly typed C language, whereas the JVML supports the type-safe Java source language.

What do they say about programming in C code?  It's like smoking a cigarette in a swimming pool full of gasoline. OK, enough of the jokes and back to the blog.

What the EBC ISA supports, though, is a concept from the UEFI Specification and the associated C sources: the 'natural integer.' The unsigned and signed natural integer, or UINTN/INTN, is the size of the VOID* of a particular native ISA. So if an EBC image is executing on a 64-bit x64 machine, sizeof(UINTN) is 8, namely 8 bytes, or 8*8 = 64 bits. Similarly, an EBC image on an IA32 machine has a sizeof(UINTN) of 4. Since the target UEFI driver's C code is compiled into a synthetic EBC ISA stream for the IMAGE_FILE_MACHINE_EBC path, the 'sizeof' operation in the UEFI driver cannot be folded into a constant at compile time; it effectively becomes a service that yields 4 on 32-bit machines and 8 on 64-bit machines. Naturally sized integers of this kind are something that most C compiler back ends do not support, hence the difficulty in creating a GCC back end for the EBC ISA that some compiler experts reported back in the early 2000s, when the EBC ISA was invented to support the EFI 1.10 driver model. At the time, the challenge was to support the two native ISA encodings in the EFI 1.10 specification - IA32 and Itanium - with a goal of having single UEFI drivers from the Independent Hardware Vendor (IHV) Host Bus Adapter (HBA) PCI boards that would work on both client IA32 and server Itanium systems.
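One way to picture the natural integer is as a two-part offset: so many pointer-sized units plus so many fixed bytes, with the pointer size only known once the image lands on a machine. As far as I recall, the EBC chapter of the specification calls this natural indexing. The sketch below is an illustration under that assumption; the function name is hypothetical and it ignores alignment padding.

```c
#include <stddef.h>

/* Hypothetical helper: byte offset described as `naturals` pointer-sized
 * units plus `constants` fixed bytes, resolved for a given native pointer
 * size (4 on IA32, 8 on x64). */
size_t natural_offset(size_t naturals, size_t constants, size_t native_ptr_size)
{
    return constants + naturals * native_ptr_size;
}
```

Take a structure that begins with a VOID* and a UINTN, followed by a UINT32. The UINT32 sits two naturals in, so natural_offset(2, 0, 4) gives 8 on a 32-bit machine and natural_offset(2, 0, 8) gives 16 on a 64-bit machine, from the same binary. A conventional back end wants that number as a compile-time constant, which is exactly the rub.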

Fast forward to 2015 from the circa 2001 creation of EBC and the EFI Driver model.

First let's talk about good ideas that didn't come to pass in the market. One idea was to have an EFI VM in the OS that could employ the EFI EBC drivers during operating system runtime. One use case was the brown-out during OS initialization after ExitBootServices and prior to loading the OS performance display driver. The usage therein included using the EFI 1.10 UGA driver (the predecessor to today's GOP) built as an EBC image to display characters while executing within an OS "EFI VM" environment. An "EFI VM" emulates a UEFI boot services (BS) environment after ExitBootServices; a popular instance thereof is the OVMF emulation of a BS environment for purposes of providing guest firmware in a virtual machine monitor, like VirtualBox. This idea was demonstrated on Linux, but ultimately concerns about 'yet another byte-code interpreter in the kernel' led to today's GOP, with the framebuffer exposed by the pre-ExitBootServices environment so that the OS early flow could just peek/poke it directly.

Again, a good idea in theory, just like the original provision for the EFI 1.02 UNDI to be usable by the OS runtime in 'safe mode' usages. The UNDI allows the OS to replace the memory and I/O services with the OS correlatives. This usage has been eschewed by the OS community even more viscerally over the last 16 years, given the need to run 'native' pre-OS EFI driver code in the OS context, although it was demonstrated how to build an "UNDI Class Driver" for OSes like Linux. The value of using the platform UEFI driver post-boot is that old OS boot media without an in-box OS driver can rely upon the system board, which typically has up-to-date firmware driver support for networking, as a simple path to reach an online resource to download the OS's performance driver.

So how important are PCI drivers on 2015 systems?
Today, most client machines do not have exposed PCI slots, but there is now a rich set of 64-bit class servers that are still consumers of enterprise storage and networking HBAs. High-end desktops still support external graphics cards, too.

Given the slot-based nature of EBC driver usage, EBC support was made optional after EFI 1.10 was contributed to the UEFI Forum and became part of the UEFI 2.0 corpus. The primary motivation at the time was the embedded community, with their often penurious flash budgets and no need for third-party driver interoperability with plug-in cards.

On the security front, the pre-OS has become much more hostile in the following 1.5 decades. As such, another nice aspect of an interpreted byte stream is the ability to sandbox the code, as mentioned on pages 11-12 of and Recall that even if your driver passes cryptographic verification, you only have some guarantee of authorship/provenance, not correct behavior. Malevolent or errant coding may still exist, thus the value of some defense in depth like the above-listed isolation techniques.
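To illustrate why an interpreted byte stream lends itself to sandboxing: every memory access already passes through the interpreter, so a bounds check is a few lines. The sketch below is purely hypothetical (the `Sandbox` type and `sandbox_read32` are my names, not anything in the EBC interpreter) and shows only the idea of confining reads to a window the firmware chose for the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sandbox: the only memory this driver may touch. */
typedef struct {
    const uint8_t *base;
    size_t         size;
} Sandbox;

/* Read a 32-bit little-endian value at `addr` (an offset into the window).
 * Returns 0 on success, -1 if any byte of the access falls outside. */
int sandbox_read32(const Sandbox *sb, uint64_t addr, uint32_t *out)
{
    const uint8_t *p;

    /* Written to avoid overflow: never compute addr + 4 directly. */
    if (addr > sb->size || sb->size - addr < 4) {
        return -1;
    }
    p = sb->base + addr;
    *out = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    return 0;
}
```

A native driver gets no such chokepoint; once it is running, its loads and stores go straight to the bus. That asymmetry is the defense-in-depth argument for byte code, independent of whether the image was signed.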

There has also been a renaissance in the compiler community with LLVM and Clang. There is interesting security work ongoing with LLVM, such as KLEE on the producer side of the C code. KLEE and the expansion of open source efforts may additionally drive some interest here. Perhaps an open source C compiler will take up the challenge of supporting EBC? Only time will tell.
