Vincent Zimmer's blog: February 2013

Sunday, February 24, 2013

Anniversary Day.Next, Arch P*'s, and some stack history

Today makes 16 years at Intel. Sunday is usually my catch-up day for work tasks or extra credit work (e.g., patent drafting), but given that I posted an entry on this day last year, I'll steal a few minutes to post something today. If I keep my wits about me maybe this can become a tradition? Induction will argue that it might be so if I hit the 'publish' button before the end of the day. 0, 1, ... infinity, right? OK. Enough of that. Popper and Hume might rise from their graves and smite me for induction invective if I go on that path.

The first topic I wanted to touch upon today includes the intent behind the architectural p*'s in the UEFI Platform Initialization (PI) Specifications. Namely the architectural PEIM-to-PEIM interfaces (PPI's) and Architectural Protocols (AP's) in Volumes 1 and 2, respectively. I have been meaning to cover this topic for a while, but today's posting is motivated by James B's foils @ http://blog.hansenpartnership.com/wp-uploads/2013/02/UEFI-Secure-Boot-2013.pdf, namely slide 15. In this presentation deck, not 'paper' as some people are wont to describe foils these days, James discusses overloading the security protocol from the PI specification from his UEFI application. As James' loader is a pure UEFI application, any dependency upon an underlying PI interface breaks portability. There is no reason that UEFI interfaces need be built upon a PI-based underlying implementation, for example.

We mention this in many places, including page 12 of UEFI_Networking_and_Pre-OS_Security or http://noggin.intel.com/technology-journal/2011/151/uefi-today-bootstrapping-continuum, viz., "PI, on the other hand, should be largely opaque to the pre-OS boot devices, operating systems, and their loaders since it covers many software aspects of platform construction that are irrelevant to those consumers."

The history of the PPI's and AP's in Intel Framework, and subsequently the UEFI PI specifications, was to abstract a portable PEI and DXE core from the respective platform. The sources for the cores are intended to be kept platform neutral such that they can be seamlessly compiled to any target architecture, which today includes IA32, x64, Itanium, and 32-bit ARM for the edk2 at http://sourceforge.net/apps/mediawiki/tianocore/index.php?title=EDK2. As such, the arch PPI's and AP's provide a hardware or platform abstraction lay (H/PAL) to the cores. Back to James' foils above, he took advantage of a design choice where the protocol's were left in the UEFI protocol database after the DXE core had bound itself to those call-points. A conformant PI implementation of UEFI could have uninstalled that protocols prior to entering BDS and invoking 3rd party UEFI drivers and applications, for example.

Omission of the AP's from the database and precluding usage by 3rd parties is not just hygiene. Abuse of the timer AP, for example, by an errant or malevolent UEFI application could change the timer tick in a way that was not consistent with the DXE core's original programming of the time base via the timer AP.

So the lesson from above includes (1) if you want to be a portable UEFI driver and application, don't invoke UEFI PI AP's, and (2) as a platform creator, be wary of the protocols you leave installed in the database prior to invoking 3rd party UEFI content.

Next on the list today includes a short history of the UEFI networking stack. The original EFI networking stack was part of the EFI1.02 specification back in 1999. We modeled the network stack on the PC/AT networking support, including the PXE base code (BC) and UNDI interfaces. The latter 2 API's were defined in the PXE2.1 specification. The PXE (pre-boot execution environment) specification contains both a wire protocol for network boot and a set of API's that a network boot program (NBP) uses so that when it is downloaded onto a PC/AT BIOS client machine it can continue usage of the client machine's network stack. For EFI1.02, we adapted the PXE BC and UNDI API's into EFI equivalents, while preserving the network wire protocol aspects, such as the DHCP control channel, TFTP-based file download, etc. EFI added an additional software API on top of UNDI called the Simple Network Protocol (SNP), too. So from BIOS to EFI we preserved PXE BC, UNDI, and added a SNP between the former two.

After EFI1.10 and its inclusion of EBC and the EFI Driver Model, we thought of what a "EFI1.2" specification might entail. One aspect of the pre-OS we explored included more efficient networks. The problem we discovered was that SNP could only be opened by one agent, namely the PXE BC. When other network drivers were added, such as the circa-1999 BSD TCP/IP port to EFI, we had to unload the PXE BC in order to have the TCP/IP stack active. Against this background of a single SNP consumer, we designed the Managed Network Protocol (MNP). MNP provides for multiple listeners and also provides a non-blocking interface, unlike the single consumer, blocking nature of SNP and UNDI. We took the concept of multiple consumers and non-blocking for the higher layer protocols, including UDP, TFTP, DHCP, and TCP/IP. In doing so, we were able to rewrite the PXE BC application to be a small consumer of the below-listed services versus the monolithic stack embedded in the original PXE BC.

Since EFI1.2 never came to light, MNP and the rest of the modular stack definition were contributed into the UEFI2.0 specification as part of the UEFI Forum's effort. This was timely in that ISCSI was added to the UEFI2.0 corpus, and implementing ISCSI and PXE on the same modular stack proved to be feasible. The next question posed to the pre-OS networking stack was what to do with PXE. Recall that the EFI, and now UEFI, interfaces to PXE included the PXE 2.1 wire protocol. We were faced with evolving PXE to an alternate wire protocol or augmenting PXE and the rest of the networking stack to meet the emergent needs in the industry, such as internet protocol version 6 (IPV6). Given the extant infrastructure of PXE servers and associated scenarios, included blended scenarios such as 'PXE boot the NBP OS installer, mount an ISCSI share, install into the ISCSI block device', etc, we opted to draft PXE into IPV6 with parallel capabilities. We coined the effort to have network boot on IPV6 'netboot6' and codified the elements of this capability in RFC 5970 http://tools.ietf.org/pdf/rfc5970.pdf and chapter 21 of the UEFI 2.3.1c specification. 5970 was evolved as part of the DHC WG in the IETF and describes a broad class of network boot transports described by a URL, which includes HTTP, TFTP, ISCSI, NFS, etc. Netboot6 opted for the TFTP-based transport to have a parallel flow to the PXE 2.1 wire protocol, but there is no reason going forward that alternate boot transports can be used.

Seeing is believing, though, so you can find the networking infrastructure mentioned above at http://edk2.svn.sourceforge.net/viewvc/edk2/trunk/edk2/NetworkPkg/.

I regret that this blog and many of the white papers referenced in the past have not treated the UEFI networking stack with much detail. I'm working to get some more collateral out in the open, especially given how to compose pre-OS networking and fast-boot. One piece of wisdom that I want to mention is that the UEFI system should not attempt to start any of the networking stack elements unless there is a boot target that includes PXE or other networking-facing service. This is in the spirit of the UEFI driver model where you should not connect a driver unless needed to boot. This sparse connect is how UEFI achieves its fast boot times relative to PC/AT BIOS's that had to spin up all of their spindles and network interfaces in absence of having a stylized boot object like the UEFI boot targets with their device paths. Given that IEEE 802.3 Ethernet devices can take 7 to 20 seconds to effect cable detect actions, any touch to the network will kill chances of achieving the 2-3 second boot times expected by modern operating systems.

Other advice I will include on network boot includes policy decisions of orchestrating IPV4 versus IPV6 on network accesses, etc. So beyond having a more detailed overview of the stack, fast boot and network selection policy, let me know if there are additional topics on the network stack that you would like to see treated.

Well, that's it for anniversary year 16 blog. Let's hope that the inductive argument holds true and I am creating the year 17 variant of this blog.

Cheers

Sunday, February 10, 2013

What is 'Compatibility'?

So why another blog just a week after the last one? Given the aperiodic nature of my earlier blogs, this frequency seems a bit out of character. Well, I finished reading Fowler's Passionate Programmer and he exhorts the reader to practice, practice, practice. Not just code, but writing. And one venue for writing is a web log. No referees, reviewers, publisher, etc between my typing and publication. So here you go, for better or worse.

Now back to the topic of this blog.

There is often talk of 'compatibility' in the computer industry. From the perspective of the platform firmware, we define compatibility as 'the software you choose to run.' In this taxonomy, compatibility is not some absolute term or Platonic truth living outside the world but instead very much a contextual term.

So this is all well and good, but how does this relate to the platform firmware specifically? In the case of the boot firmware, we think of compatibility in terms of the PC/AT BIOS and its interfaces. These interfaces include the 16-bit int-callable interfaces published by the system board BIOS and BIOS extensions for use by booted software. The 'booted software' can include the MBR first stage loader and subsequent system software. The BIOS industry referred to these int-callable interfaces, such as int10h for video, int13h for disk, int16h for keyboard, along with data structures like the BIOS Data Area (BDA) and Extended BIOS Data Area (EBDA) as the "BIOS Runtime." This BIOS Runtime is distinct from the code that executes after a platform restart and the germination of the BIOS runtime interfaces. This phase of execution has been historically lumped under the term Power-On Self Test (POST).

Back in the early 1999 when the Extensible Firmware Interface (EFI) was first being deployed, this phase of execution, or the Boot Services (BS) and subsequent Run Time (RT), were really akin to the BIOS Runtime. This parallel is borne of the fact that the EFI specification, nor its successor the UEFI specification, dictates 'how' these environments appear. The pre-EFI/UEFI phase is opaque to and distinct from the UEFI BS and RT, just as POST is distinct from the BIOS Runtime and its int-calls.

In the case of EFI, the Intel Platform Innovation Framework (aka "Framework") and its successor, the UEFI Platform Initialization (PI) specifications, defined a stylized POST phase. This includes the SEC, PEI, and DXE phases of execution. But just as EFI is distinct from its preceding phase, the PI phase does not necessarily have to create a full UEFI environment. And it is this PI versus UEFI where compatibility comes into play. You see, as EFI and UEFI worked to gain acceptance, the bulk of platforms in the market since 1982 supported PC/AT BIOS and 16-bit BIOS option ROMs and OS loaders.

To leverage that large class of 'legacy' BIOS infrastructure, the Compatibility Support Module (CSM) was borne. The CSM was developed by a project that preceded the "Tiano" program but was codified into a Framework Specification http://www.intel.com/content/www/us/en/architecture-and-technology/unified-extensible-firmware-interface/efi-compatibility-support-module-specification-v097.html. This earlier program referred to the compatibility code as the 'fur ball' with the hope that one day a native mode firmware stack would prevail and 'cough up' the compatibility codes.

The CSM provides a way to encapsulate a BIOS runtime inside a Framework or PI-style firmware. The CSM specification abstracts the information collected during a PI-style POST, namely SEC, PEI, and DXE, and passes that information into the CSM wrapped version of a BIOS 16-bit runtime. We often refer to this binary in our edk2-based trees as a CSM16 binary.

Recall that UEFI provides services for blocks such as the EFI Block I/O protocol, and the BIOS supports similar capabilities via the int13h API. To leverage an underlying 3rd party storage option ROM's 16-bit disk services, a UEFI driver can be generically written to thunk or call down into the 16-bit code. The thunk is provided as a general utility via a API member of the CSM protocol. There is also a reverse thunk, but it is rarely if ever used. You see, a CSM16 cannot practically call back into native UEFI 32-bit or 64-bit code because when control is passed to a 16-bit MBR loader, the int15h E820h memory map that the legacy OS receives has the UEFI boot service pages mapped as available. And since there is no equivalent of ExitBootServices() for a legacy OS boot, UEFI services in the form of boot services drivers cannot reliably be invoked via a reverse thunk after the int19h IPL.

EFI/UEFI calling down into PC/AT BIOS CSM16, on the other hand, can be done throughout the boot services phase since the CSM16 services are available until ExitBootServices, so a BiosBlockIo or BiosVideo driver can reliably thunk to the int13h and int10h CSM16, respectively. The only way to have 32-bit or 64-bit native codes that are shareable between CSM16 and a DXE-based UEFI BS is to put the codes into something like a DXE SMM driver. The downside of the latter is that 3rd party UEFI drivers cannot be loaded into SMRAM. This only works for system board hardware, such as USB ports in the chipset.

And now even within EFI/UEFI there are compatibility concerns. EFI1.02 to EFI1.10 introduced OpenProtocol, and implementations of the EFI1.02 HandleProtocol had to use the later reference-counted API to implement the earlier non-reference counted API. Then UEFI2.0 in 2006 deprecated UGA and DeviceIo. From UEFI2.0 to 2.1, then 2.2, 2.3, 2.3.1 have introduced new services. Framework HII to UEFI HII. A style of programming to maintain compatibility often includes reading the specification version in the system table in order to take the appropriate code path. The same holds true for PI. PI became with the donated Intel Framework specifications, and from Framework 0.9x to PI 1.0, 1.1, 1.2, 1.2.1, 1.2.1a, etc there have been similar concerns with SEC, PEI, DXE, SMM.

So that's my blog for today on compatibility.

Saturday, February 2, 2013

32 versus 64 bit and measuring UEFI Secure Boot

This blog begins with a discussion of the execution mode of the pre-operating system (OS) and the operating system run time (RT). This is a question that comes up often, so I wanted to given some overview and history.

32v64 - The OS view of the world
We'll being with the OS-visible portion of the pre-OS, or the UEFI boot services (BS) environment. For today's 64-bit (aka x64) operating systems, you need a 64-bit kernel to be booted from 64-bit UEFI firmware. Or BS and RT execution modes need to match the kernel execution mode. Similarly, a 32-bit OS kernel needs a 32-bit RT and 32-bit BS. For example, x64 (aka x86-64, EM64T, AMD64, Intel64....) Ubuntu or Windows8 need to have x64 UEFI.

On this point UEFI differs from 16-bit PC/AT BIOS where you can boot 16, 32 or 64-bit kernels. In 1982, BIOS was 16-bit, as was Microsoft DOS. The I/O subsystem of DOS was the BIOS, so the DOS run time calling into the 16-bit BIOS int-calls worked out well. DOS was single-threaded, BIOS calls are blocking, and everyone was happy. This worked through Windows3.1 where DOS was still effectively the kernel. The tension in this model appeared with 32-bit OS's where the kernel would 'thunk' or make down-calls into BIOS when necessary, alongside the 32-bit native driver model with VxD's. With NT and beyond, though, the kernel would have native drivers such that the BIOS was only used for booting. Same story for Linux with its pure native drivers.

And on the point of booting, the first stage OS loader would typically stay in 16-bit mode in order to read the kernel from disk or network using 16-bit BIOS calls. Then the kernel would trampoline into 32-bit mode in order to run the native OS kernel. When x64 came on the scene in the mid-2000's, this trampoline process was extended to go from 16-to-32-to-64 bit mode.

The notable point on PC/AT BIOS and OS run times today is that modern OS kernels do not invoke 16-bit services after the initial loader. This is distinct from EFI/UEFI with its 'runtime' (RT) services which are by-design intended for OS kernel invocation.

Specifically, UEFI, on the other hand, took a different path. The original EFI 1.02 specification in 1999 had an IA32 (aka x86) binding that was 32-bit only. EFI antedated the relase of x64. EFI1.02 continued having only the IA32 binding. When x64 became public, it was still prior to 2006 and the standardization of EFI as UEFI. Intel owned the EFI specification at the time and pondered how to address support of x64. One side of the EFI house advocated today's behavior where the firmware Instruction Set Architecture (ISA) mode == kernel ISA mode, whereas a smaller group was in the camp of a modal solution.

Enter the '2-headed firmware.' The idea of the firmware having two modes or 'heads' was to forever keep EFI as 32-bit and having the ability for a 32-bit or 64-bit set of run time services to be published from the firmware for the kernel (32 or 64) to call.'

The requirement for the kernel to match the firmware begins with the ISA mode of the EFI RT. To support run time services, the kernel maps the EFI run time and directly calls into it from ring 0. Since the EFI run time is defined to be the same as the EFI boot services in the EFI system table, RT and BS must be the same. This leads to the model where if you want to boot a 64-bit OS, you need 64-bit firmware. Modern OS eschew 'thunking' or down-calling into 32-bit mode from a 64-bit kernel, unlike the practice of doing this from user mode (e.g., WOW64 in Windows and capability binary support in Linux).

So back to the 2-headed BIOS versus the pure 64-bit firmware. Both solutions were built, but it was quite tricky to implement the 64-bit RT. We ended up with building a shim that invoked an SMI so that the bulk of code shared between 64-bit and 32-bit RT was handled by the Framework SMM DXE code (i.e., SMM driver model prior to the UEFI PI). The 64-bit EFI was cleaner since the mode of EFI is also the mode of DXE, our EFI core. And with the UEFI 2.0 specification in 2006, the 64-bit work was contributed and became one of the central features of the industry-owned specification.

For a refresher on UEFI versus PI, check out Beyond BIOS http://www.amazon.com/Beyond-BIOS-Developing-Extensible-Interface/dp/1934053295/ or chapter 3 of Hardware Dependent Software http://www.amazon.com/Hardware-dependent-Software-Principles-Wolfgang-Ecker/dp/9048181283/.

After all of the that, you can see why ACPI and its static tables and interpreted AML byte-codes are the preferred run time interface to the platform for PC/AT BIOS and UEFI systems today. The co-location of the UEFI RT with the kernel in ring 0 and some of the vagaries listed above in implementation argue for not growing the corpus of UEFI RT capabilities.

32v64 - The platform construction view of the world
You will note above some discussion of DXE along with the EFI ISA mode work. Just as the UEFI RT dictates the UEFI BS mode of operation, the UEFI BS mode dictates the Driver Execution Environment (DXE) mode of operation since the Framework and UEFI Platform Initialization (PI) DXE forms the core of the EFI and UEFI interface sets, respectively. The same holds true for Framework and PI SMM in the DXE is the SMM loader, so the ISA mode of DXE boot service-time became the mode of the SMM DXE. Once the die was cast for 64-bits, almost everything in the pre-OS became 64-bit.

"Almost everything" was the case because there are a couple of modes of operation prior to DXE, namely the SEC and PEI phases. These two phases commence after a processor reset. Even with the advent of x64, Intel Architecture CPU's still commence operation in 16-bit mode. As such, the SEC phase mode switches to the PEI mode of operation. PEI can execute in 32-bit or 64-bit mode. Some of the complexities of 64-bit long mode entail larger binaries and having to run with paging enabled. These can be covered by budgeting for the flash space for larger code images and setting up ROM'd page tables with the AD bits pre-set so that page fault walkers don't panic trying to update read-only PTE's. The advantages of having the ISA mode of PEI and DXE match also include having codes from PEI that can be passed into DXE and re-used, such as the PE/COFF loader and the report status code logic. The latter is especially important in that prior to loading the DXE architectural protocols, a HOB proxied report status code pointer can be used to update the boot progress during the grey period between PEI hand-off and DXE start up. 32-bit mode, on the other hand, can run in physical mode without paging. Since such PEI execute in place (XIP) codes rarely need greater than 32-bit addressing and need to be penurious on code size and data usage because of cache-as-RAM size limitations, today PEI implementations are typically 32-bit and the final PEI, or the DXE IPL PEIM, will mode switch to 64-bit as part of invoking DXE Main.

So in summary, the ISA mode of the firmware and the OS kernel is not so simple of a story, and the story is again distinct from the ISA mode of the early PI phases of execution.

So enough on 32-bit and 64-bit.

Note (3/4/14): Matt Fleming has an experimental patch treating booting a 64-bit Linux kernel on a 32-bit UEFI platform https://lkml.org/lkml/2014/3/4/242

UEFI Secure Boot - Measuring Policy
The other thing I wanted to mention was a recent publication on MSDN, namely the Trusted Execution Environment (TrEE) EFI Protocol and measurement updates, which can be found at http://msdn.microsoft.com/en-us/library/windows/hardware/jj923068.aspx. This publication is important because the language toward the end around PCR[7] logging of the UEFI 2.3.1c PK, KEK, and DB/DBX addresses one of the architectural gaps of scenarios that employ both measured and secure boot. Recall that the paper "UEFI Networking and Pre-OS Security" at https://noggin.intel.com/content/uefi-networking-and-pre-os-security-0  describes the relationship of measured and secure boot. Specifically, on page 94 of the same, "...Measured Boot must include the Allowed, Forbidden, KEK, and PK variables (databases) in its measurements of a Secure Boot-configured platform." As a result, the measurement language in the TrEE protocol provides a solution for recording the state of the UEFI Secure Boot enforcement policy.