Vincent Zimmer's blog: DRTM

Sunday, February 12, 2023

Blue Hat 2023 and UEFI Secure Boot

“Victory has 100 fathers and defeat is an orphan.” This quote is attributed to John F. Kennedy in the wake of the Bay of Pigs, but I believe a variant of this has been passed down over time. I was thinking of this quote in the context of UEFI Secure Boot this week at Blue Hat https://www.microsoft.com/bluehat/ at Microsoft building 92. Visiting the Microsoft campus reminded me of the many hours engaged with Microsoft during the build-up to this feature in the mid-2000s (including hours in the 40's buildings that were razed recently for the new MS campus).

The above talk from Eclypsium (now posted https://github.com/n0x08/ConferenceTalks/blob/master/0-Day_FirmWarez_BlueHat2023.pdf) this last week also reminded me of how UEFI secure boot and firmware security is now part of the tech lexicon.

But back to the history discussion from the early 2000's. At that time, the industry was recovering from the lack of adoption of NGSCB https://en.wikipedia.org/wiki/Next-Generation_Secure_Computing_Base. Intel LaGrande Technology (LT), which is now Trusted Execution Technology, along w/ AMD Pacifica/Presidio, were features to add ‘late launch,’ or ‘dynamic root of trust for measurement’ (DRTM). The DRTM facility provided an RTM in the CPU, and the roots of trust for storage (RTS) and reporting (RTR) were delegated to a ‘then’ off-chip element called the Trusted Platform Module (TPM) https://trustedcomputinggroup.org/work-groups/trusted-platform-module/.

Since NGSCB, with its “Trusted Applet” (TA) architecture, was a mile too far both for privacy and application compatibility, LT and Pacifica didn’t get embraced by Windows. NGSCB entailed the need to refactor applications to put security-sensitive code into a memory-only execution regime. Although in 2023 folks may say ‘so what, SGX did it,’ recall this is 2002. So folks like Peter Biddle managed to build a full-disk encryption (FDE) feature around the TPM and a BIOS-based root-of-trust for measurement (RTM), called the Static RTM (SRTM).

This is where I entered the scene. I worked with folks like Mark Williams to author and enable the EFI TPM and platform specifications. The former entailed the API exposed to UEFI drivers and OS loaders/applications, and the latter defined what type of code and data objects to record or ‘measure’ in the TPM’s Platform Configuration Registers (PCRs). More information on the various RT*’s and the TPM can be found in https://www.intel.com/content/www/us/en/content-details/671466/trusted-platforms-uefi-pi-and-tcg-based-firmware.html.

While doing this work with Microsoft, I learned that recording the hash of the Portable Executable/COFF (PE/COFF) image was not so simple. It entailed hashing various portions of the binary and omitting others. It turns out this hash or ‘digest’ is a critical element for code integrity (i.e., what became documented as the Authenticode hashing algorithm). At the same time that Microsoft was reeling from having no DRTM, it moved to static verification with the invention in Vista of Code Integrity, or CI. CI entailed having the OS loader cryptographically verify the digital signature of the early boot and kernel components in Windows, thus having something of a static root of trust for verification (RTV), with the ‘root’ being the OS loader.
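To make "hashing various portions of the binary and omitting others" concrete, here is a minimal sketch that only locates the byte ranges the image hash skips: the optional header CheckSum field, the Certificate Table entry in the data directory, and the attribute certificate table itself. The function and type names are made up for illustration, the offsets follow the public PE/COFF layout, and the remaining ordering rules of the algorithm (such as section sorting) are left out.

```c
#include <stddef.h>
#include <stdint.h>

/* A byte range [offset, offset + length) that the image hash skips. */
typedef struct {
    size_t offset;
    size_t length;
} skip_range;

/* Little-endian reads that do not assume alignment. */
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) {
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Hypothetical helper: fill out[0..2] with the skipped ranges.
 *   out[0] = CheckSum field of the optional header
 *   out[1] = Certificate Table entry (data directory index 4)
 *   out[2] = attribute certificate table (length 0 if unsigned)
 * Returns 0 on success, -1 if the buffer does not look like PE/COFF.
 */
int pe_hash_skip_ranges(const uint8_t *img, size_t size, skip_range out[3])
{
    if (img == NULL || size < 0x40 || img[0] != 'M' || img[1] != 'Z')
        return -1;

    size_t pe = rd32(img + 0x3C);                 /* e_lfanew */
    if (pe > size || size - pe < 24 + 2)
        return -1;
    if (rd32(img + pe) != 0x00004550u)            /* "PE\0\0" */
        return -1;

    size_t opt = pe + 4 + 20;                     /* signature + COFF header */
    uint16_t magic = rd16(img + opt);
    /* The data directories start at a different offset for PE32 and PE32+. */
    size_t dirs_rel = (magic == 0x10B) ? 96 : (magic == 0x20B) ? 112 : 0;
    if (dirs_rel == 0 || size - opt < dirs_rel + 5 * 8)
        return -1;
    if (rd32(img + opt + dirs_rel - 4) < 5)       /* NumberOfRvaAndSizes */
        return -1;

    size_t cert_entry = opt + dirs_rel + 4 * 8;
    size_t cert_off = rd32(img + cert_entry);     /* a file offset, not an RVA */
    size_t cert_len = rd32(img + cert_entry + 4);
    if (cert_off > size || cert_len > size - cert_off)
        return -1;

    out[0] = (skip_range){ opt + 64, 4 };
    out[1] = (skip_range){ cert_entry, 8 };
    out[2] = (skip_range){ cert_off, cert_len };
    return 0;
}
```

A hashing routine would feed everything outside these three ranges to the digest. Skipping them is what lets a signature be attached after the fact without changing the value that was signed.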

I learned about how all of this CI worked after reading Matthew Conover’s paper https://github.com/tpn/pdfs/blob/master/Assessment%20of%20Windows%20Vista%20Kernel-Mode%20Security%20-%20Matthew%20Conover%20(Symantec).pdf.

This feature is also described in https://csrc.nist.gov/csrc/media/projects/cryptographic-module-validation-program/documents/security-policies/140sp1327.pdf. Learning about how CI worked and the top-of-mind reminders from folks like Heasman https://www.blackhat.com/presentations/bh-usa-07/Heasman/Presentation/bh-usa-07-heasman.pdf led me to suggest adding an ‘SRTV’ to the firmware to complement the existing ‘SRTM’ we had been building. I roughed out some of those thoughts in the 2007 paper https://github.com/vincentjzimmer/Documents/blob/master/SAM4542.pdf https://dblp.uni-trier.de/rec/conf/csreaSAM/Zimmer07.html?view=bibtex, mentioned in an earlier post http://vzimmer.blogspot.com/2022/08/pqc.html, too.

In other words, leaving the OS loader hanging in space to start the verification chain seemed fragile, so having the underlying firmware act as a static RTV for the loader would strengthen this static verification chain. It was also complemented by the SRTM since the TPM could provide evidence or an audit log of the verification actions.

Given the difficulty in crafting pre-OS malware w/ MBR's for PC/AT boot, the lack of interoperable code integrity in UEFI would have made things so painful as I noted in http://vzimmer.blogspot.com/2012/09/late-september-mumbling.html. Also, extending CI into the host firmware felt as natural as extending the OS filesystem via the EFI System Partition (ESP) into the firmware for purposes of interoperability.

One of the challenges posed during this work was to have a reasoned approach to the infrastructure. That’s where Varugis (former NGSCB TA architect turned Windows CI architect) and I tried to create an integrity model of the pre-OS https://github.com/vincentjzimmer/Documents/blob/master/integrity-protection-analysis-of-OS-preboot.pdf. Varugis would often rail against folks who said the BitLocker was for code-integrity since it was largely a data integrity feature. He wanted to ensure that any functionality built into the ecosystem had a sound foundation. To that end the platform was decomposed into a series of Clark-Wilson compartments (although the CDI’s of CW and reuse of acronym of the same in TCG DICE is amusing). The OEM compartment, extensible pre-OS UEFI compartment, and finally, the OS compartment had rules, including guards and access controls.

It turns out that since CI was just for internal use by Microsoft tooling, the Authenticode hashing algorithm and signature section of the PE/COFF were not documented. Just as UEFI needs MS FAT and PE/COFF itself for purposes of interoperability, this information was added to the public document from Microsoft to support this capability https://download.microsoft.com/download/9/c/5/9c5b2167-8017-4bae-9fde-d599bac8184a/authenticode_pe.docx, as were the required infrastructure elements in the 2.1+ specifications. The variant of UEFI Secure Boot that ultimately shipped as a required capability in Windows 8, along with UEFI measured boot, was the specification version 2.3.1 http://www.uefi.org/sites/default/files/resources/UEFI_Spec_2_3_1.pdf.

Another interesting tie-in to last week's Bluehat was the keynote from Mark Russinovich. Mark was one of the people to whom we presented this UEFI Secure Boot work back in the day, too. Others in the audience included MS security architect, and former Intel colleague, Carl Ellison.

BTW
ECR48 was remastered in M279 https://uefi.org/specs/UEFI/2.10/Frontmatter/Revision_History.html

Mark's keynote at Bluehat covered confidential computing, large language models (LLMs) for security, open source firmware foundation, safe languages, and the software supply chain.

On safe languages

I was reminded of https://cfp.osfc.io/media/osfc2020/submissions/SLFJTN/resources/OSFC2020_Rust_EFI_Yao_Zimmer_NDK4Dme.pdf, https://uefi.org/sites/default/files/resources/Enabling%20RUST%20for%20UEFI%20Firmware_8.19.2020.pdf, and https://link.springer.com/chapter/10.1007/978-1-4842-6106-4_20.

And for the challenges of the open source supply chain, and software supply chains in general, I commiserated with

These slides reminded me of the long-used

from https://embeddedcomputing.com/technology/security/software-security/understanding-uefi-firmware-update-and-its-vital-role-in-keeping-computing-systems-secure, too.

OK. Back to the narrative. By this time the UEFI secure boot capability had iterated through early discussions and into the standards incubation mix. The final clean-up of how authenticated variables work, including the append option for servicing, were introduced by Magnus. Magnus Nystrom was the security lead in Windows core OS and managed to get the feature finally into the OS. We described some of this journey in https://www.intel.com/content/dam/www/public/us/en/documents/research/2011-vol15-iss-1-intel-technology-journal.pdf.

One of the challenges of this work was how to apply trust for the more open PC architecture. Cell-phones and other appliances had locked-down bootloaders, but with the PC and its ability to multi-boot (think reason for UEFI existence) and adapter cards, the trust needed to be more interoperable. This is where the concept of the various stakeholders was created: the OEM with the PK, the OS vendors and ISV’s with the KEK’s, and the hashes or verification certificates in the allowed DB and disallowed DBX.
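The interplay of the allowed DB and disallowed DBX can be sketched as a small decision function. This is a deliberately reduced model that I am assuming for illustration: it handles SHA-256 hash entries only and ignores certificate entries, the PK/KEK update authorization, and everything else a real implementation does. All names are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DIGEST_LEN 32  /* SHA-256 digest size in bytes */

/* A list of image digests, standing in for the hash entries of db or dbx. */
typedef struct {
    const unsigned char (*digests)[DIGEST_LEN];
    size_t count;
} hash_list;

static bool list_contains(const hash_list *l, const unsigned char d[DIGEST_LEN])
{
    for (size_t i = 0; i < l->count; i++)
        if (memcmp(l->digests[i], d, DIGEST_LEN) == 0)
            return true;
    return false;
}

/*
 * An image may run only if its digest is NOT in the disallowed list
 * and IS in the allowed list. The dbx check comes first, so a revoked
 * image stays revoked even when an old db entry still names it.
 */
bool image_allowed(const hash_list *db, const hash_list *dbx,
                   const unsigned char digest[DIGEST_LEN])
{
    if (list_contains(dbx, digest))
        return false;
    return list_contains(db, digest);
}
```

The point of the ordering is that revocation wins: the disallowed list is consulted before the allowed list, so an image unknown to both is simply refused.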

Tim Lewis https://uefi.blogspot.com/ championed a lot of the details on this design through the forum, too. Tim also led the UEFI Security Subteam (USST) in those first heady years.

A spate of additional publications were created to describe this capability, including https://github.com/tianocore-docs/Docs/blob/master/White_Papers/A_Tour_Beyond_BIOS_into_UEFI_Secure_Boot_White_Paper.pdf created in the wake of my first ToorCamp talk, to https://www.intel.com/content/www/us/en/content-details/671464/a-tour-beyond-bios-with-the-uefi-tpm2-support-in-edk-ii.html and https://www.intel.com/content/www/us/en/content-details/671120/a-tour-beyond-bios-uefi-authenticated-variables-in-smm-with-edk-ii.html on how to use the code https://github.com/tianocore/edk2/tree/master/SecurityPkg. Given the long arc of this work, https://link.springer.com/book/10.1007/978-1-4842-6106-4 was created to collect this background in one location.

The evolution of UEFI secure boot wasn't without some controversy, especially from privacy groups and others who worried about lock-down of the PC https://www.csoonline.com/article/2221385/geeks-under-fire--war-on-privacy--freedom-and-general-computation.html, but Microsoft logo requirements like a physically-present user control for this feature, along with the UEFI CA signing Linux shims, helped calm those concerns. Some of the great work w/ Linux in this space was described in https://www.intel.com/content/dam/develop/external/us/en/documents/sf13-stts002-100p-820238.pdf, too, although it wasn't always smooth sailing, as I recall people glowering at me when I presented https://github.com/vincentjzimmer/Documents/blob/master/PLUG-UEFI-001.pdf in Portland. One of the MS PM's even told me that Sinofsky considered pulling all of the UEFI features from Windows 8 if the issue were not resolved in a timely fashion, which makes sense given that wide-scale UEFI deployment without integrity controls would have led to unbounded pre-OS malware concerns for users.

Work continues apace in this area, from investigations into post quantum cryptography https://eprint.iacr.org/2021/041.pdf https://uefi.org/sites/default/files/resources/Post%20Quantum%20Webinar.pdf to better revocation models https://github.com/rhboot/shim/blob/main/SBAT.md. And DRTM is getting great traction in the platform now https://cdrdv2-public.intel.com/756963/DRTM-based-computing_whitepaper_FINAL_MAY2021.pdf, although contemporary with Windows 8 we worked hard to see if we could have a standards-based solution https://trustedcomputinggroup.org/work-groups/trusted-platform-module/, too. DRTM definitely reduces the trusted computing base (TCB), thus why the DRTM and optional DRTV were mentioned in https://www.intel.com/content/www/us/en/content-details/671466/trusted-platforms-uefi-pi-and-tcg-based-firmware.html, too.

And some of the ecosystem challenges of distributed trust, as shown in the 'figure 5' above with the various trust anchors for UEFI secure boot (including the UEFI CA and revocation lists like the dbx https://uefi.org/revocationlistfile), don't magically disappear. I was reminded of Mark Twain's 'History never repeats itself, but it does often rhyme' recently during a presentation in the Open Compute Project Security team https://www.opencompute.org/wiki/Security discussion on January 31 https://docs.google.com/document/d/1VVMUzYESZNuyT1_YJlQSdSKBy-5t1otJIyXTbXuOoX4/edit# regarding a CA for the device firmware for usages like SPDM. Some of those melodies included:

"If there is an explosion in the number of CAs how will verifier vet the trustworthiness of the CAs?
..
Want to see a manageable number of CAs under a common policy
..
Trust stores in components will have large number of anchors and have reoccuring updates
..
Devices without internet connectivity cannot retrieve trust anchor info
..
Small vendors may have difficulty operating root CAs
..
Common PKI CA policy."

These are many of the issues UEFI has encountered and grappled with since the late 2000's.

So back to the quote at the top of the posting. The reason for many ‘fathers’ is often that for an idea to come to market the originator of the concept needs many hands to help shepherd it forward through the vagaries of business, development, validation, and ultimately shipping at scale. And as I updated Ubuntu on a home PC this weekend and noticed,

this stuff is still pretty cool.

A PhD colleague of mine recently decried the lack of authoritative knowledge, from blogs to arxiv. Luckily in this scurrilous post (aka 'blog') I haven't claimed any refereed, authoritative discourse. These are more the late weekend musings motivated by a wonderful arc of interactions with colleagues, many of whom have changed companies, retired, or passed away.

Lest that sound unhappy, though, I did catch up with David and crew over Teriyaki on Thursday. As I offered them some Bluehat swag

David let me know that he and the Microsoft team who implemented UEFI Secure Boot in Windows actually won a Microsoft-wide award that year, viz.,

Very nice (although the referenced link is dead).

Even in the world of cyber-everything, I do like the physical relics (including security faux currencies)

spanning journeys from

https://github.com/rrbranco/BlackHat2017

I guess this closes the circle for this blog. I opened with Nate Warfield's prezo https://www.helpnetsecurity.com/2022/06/19/eclypsium-executive-team/ and he's part of https://eclypsium.com/company/, which is led by Yuriy and John, my co-presenters from the 2013 Cisco seccon talk.

Sunday, June 7, 2015

GUIDs, Revisions, Interrupts

This blog will cover a few topics, including GUIDs, Revisions, Hardware Interrupts, and Portable Libraries.

GUID versus Revision
To begin, I was recently asked about how one should evolve a protocol interface. There are two ways to extend an interface: 1) have a revision field that designates if the service set or data has been extended, and 2) define a new protocol GUID.

The first technique will be familiar from some of the original EFI1.02 style API's, such as the EFI_BLOCK_IO_PROTOCOL and the various service tables, such as the EFI System Table, PEI Service Table in the PI specification, etc. For this technique, the service table or protocol can be extended in a backward-compatible fashion by appending new services to the end of the table while at the same time increasing the revision number.

This leads to a programming technique wherein the caller locates the protocol or service table and has to check whether the revision is greater than or equal to the number that matches the industry standard. An example of this technique in action is the EFI_PEI_RESET2_SYSTEM in the recently published PI1.4 specification http://www.uefi.org/sites/default/files/resources/PI_1_4.zip. If a calling PEIM wants to use this service but also maintain portability across PI1.0 through PI1.3 conformant systems, the caller would only invoke the new PI1.4 service if the PEI Service table revision is greater than or equal to 1.40.
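The revision check described above might look something like the following. This is a sketch with stand-in types rather than the real PI headers: the table layout, the function signatures, and the (major << 16) | minor revision encoding are all assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding: major revision in the high 16 bits, minor in the low. */
#define MAKE_REVISION(major, minor) \
    (((uint32_t)(major) << 16) | (uint32_t)(minor))
#define REVISION_1_40 MAKE_REVISION(1, 40)

/* Hypothetical service table: reset2 was appended in revision 1.40. */
typedef struct {
    uint32_t revision;
    int (*reset)(void);            /* present since the first revision */
    int (*reset2)(int reset_type); /* only valid when revision >= 1.40 */
} services_table;

/*
 * Portable caller: check the revision BEFORE touching the appended
 * member, since an older producer's table simply ends earlier.
 */
int do_reset(const services_table *t, int reset_type)
{
    if (t->revision >= REVISION_1_40 && t->reset2 != NULL)
        return t->reset2(reset_type);
    return t->reset();
}

/* Stub services standing in for a platform's implementation. */
int stub_reset(void)      { return 1; }
int stub_reset2(int type) { return 100 + type; }
```

The same caller binary then behaves sensibly on both an old and a new producer, which is the whole point of appending services rather than reordering them.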

This first technique is required for the service tables, but for protocols the preferred extension method is to define a new GUID. Although the original EFI Protocols featured the revision field, and protocols like EFI_BLOCK_IO_PROTOCOL have been extended via the revision field, all other protocols have been evolved via new GUIDs. This can include the EFI_SIMPLE_TEXT_INPUT to the EFI_SIMPLE_TEXT_INPUT_EX change, but more often the growth is seen via appending a '2' to the original protocol, such as in EFI_LOAD_FILE2_PROTOCOL, EFI_DRIVER_DIAGNOSTICS2_PROTOCOL, EFI_COMPONENT_NAME2_PROTOCOL, EFI_FORM_BROWSER2_PROTOCOL, etc. These can be found in the UEFI 2.5 specification http://www.uefi.org/sites/default/files/resources/UEFI%202_5.pdf.

The nice thing about the second technique is that the caller doesn't have to do the cumbersome revision check, PI variants can include the proper name in the dependency expression, and API bug fixes can span all of the services. In addition, the producer of the protocol can produce the original and '2' variant quite easily by sharing implementations of the common services in a single driver.
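As a rough illustration of the second technique, the sketch below has one producer publish both an original and a '2' protocol that share the common service, while the consumer asks for the '2' GUID first and falls back to the original. The protocol names, GUID values, and the tiny lookup table are invented for this example and are not taken from any specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } guid_t;

/* Made-up GUID values, for illustration only. */
const guid_t FOO_PROTOCOL_GUID  = {{ 0x01 }};
const guid_t FOO2_PROTOCOL_GUID = {{ 0x02 }};

typedef struct { int (*read)(void); } foo_protocol;
typedef struct { int (*read)(void); int (*read_ex)(int flags); } foo2_protocol;

/* One driver produces both variants and shares the common service. */
static int shared_read(void)     { return 42; }
static int read_ex(int flags)    { return 42 + flags; }
foo_protocol  foo_v1 = { shared_read };
foo2_protocol foo_v2 = { shared_read, read_ex };

/* A toy protocol database keyed by GUID. */
typedef struct { guid_t guid; void *interface; } protocol_entry;

void *locate_protocol(const protocol_entry *db, size_t n, const guid_t *g)
{
    for (size_t i = 0; i < n; i++)
        if (memcmp(db[i].guid.b, g->b, sizeof g->b) == 0)
            return db[i].interface;
    return NULL;
}

/* Consumer: no revision check, just ask for the interface it wants. */
int consumer_read(const protocol_entry *db, size_t n, int flags)
{
    const foo2_protocol *p2 = locate_protocol(db, n, &FOO2_PROTOCOL_GUID);
    if (p2 != NULL)
        return p2->read_ex(flags);
    const foo_protocol *p1 = locate_protocol(db, n, &FOO_PROTOCOL_GUID);
    if (p1 != NULL)
        return p1->read();
    return -1;  /* neither variant is installed */
}
```

Finding the GUID is itself the capability check, so the caller never has to compare revision numbers.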

Windows has done a similar evolution of its API's, although it often appends 'Ex' to designate the new API. Examples therein include IoConnectInterrupt https://msdn.microsoft.com/en-us/library/windows/hardware/ff548371(v=vs.85).aspx to IoConnectInterruptEx https://msdn.microsoft.com/en-us/library/windows/hardware/ff548378(v=vs.85).aspx.

Hardware Interrupts
Speaking of hardware interrupts, the question of hardware interrupt usage in UEFI has been brought up http://sourceforge.net/p/edk2/mailman/message/28764215/ many times since the roll out of EFI 1.02 in 1999.

It turns out hardware interrupts are used in UEFI, including at least the hardware timer tick. But as table 22 of the UEFI 2.5 specification notes, the system may choose to implement "firmware interrupts" between TPL_NOTIFY and TPL_HIGH_LEVEL.

The table says "This level is internal to the firmware."

This means that the firmware must adhere to the TPL mapping in the specification in order to maintain interoperability, viz.,

#define TPL_APPLICATION 4
#define TPL_CALLBACK 8
#define TPL_NOTIFY 16
#define TPL_HIGH_LEVEL 31

but nothing stops a given underlying UEFI implementation, such as one based upon PI DXE that also uses TPL's, to define

#define TPL_DEVICE_1 18
#define TPL_DEVICE_2 20
#define TPL_DEVICE_MAX 28

And if you look at the DXE core implementation, a TPL level is really just a linked list. So the way that a hardware interrupt protocol driver could be implemented would be to use the CPU architectural protocol to register interrupt service handlers (ISRs) with the SOC, using something like a programmable interrupt controller's priorities to map TPL_DEVICE_1 to lower priority devices and TPL_DEVICE_MAX to higher priority. Think low-speed consoles like a UART to the former and a high-speed networking device to the latter.
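A minimal sketch of that mapping, assuming the implementation-private TPL_DEVICE values shown above and a hypothetical interrupt controller whose priorities count up from zero:

```c
#include <stdbool.h>

/* Values from the UEFI specification. */
#define TPL_APPLICATION 4
#define TPL_CALLBACK    8
#define TPL_NOTIFY      16
#define TPL_HIGH_LEVEL  31

/* Implementation-private values, as in the example in the text. */
#define TPL_DEVICE_1    18
#define TPL_DEVICE_2    20
#define TPL_DEVICE_MAX  28

/* True if tpl falls in the firmware-internal device range. */
bool is_device_tpl(unsigned tpl)
{
    return tpl >= TPL_DEVICE_1 && tpl <= TPL_DEVICE_MAX;
}

/*
 * Map a device TPL to a priority on a hypothetical interrupt
 * controller: 0 for the slowest devices (think UART), rising to
 * TPL_DEVICE_MAX - TPL_DEVICE_1 for the fastest (think NIC).
 * Returns -1 for TPLs outside the device range.
 */
int tpl_to_irq_priority(unsigned tpl)
{
    if (!is_device_tpl(tpl))
        return -1;
    return (int)(tpl - TPL_DEVICE_1);
}

/* A device interrupt is only taken when its TPL is above the current one. */
bool irq_may_preempt(unsigned current_tpl, unsigned device_tpl)
{
    return is_device_tpl(device_tpl) && device_tpl > current_tpl;
}
```

Code running at TPL_NOTIFY can thus be interrupted by any device in the range, while code at TPL_HIGH_LEVEL cannot be interrupted at all.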

The implementation of the ISR would be similar to the top half http://www.makelinux.net/ldd3/chp-10-sect-4 of a Linux driver, namely just enough code to quiesce the device that triggered the interrupt and then signal an event to invoke a lower TPL handler, something like the bottom half of a Linux driver. Another technique is the invocation of a Deferred Procedure Call (DPC). The motivation for the DPC-like logic is to do most of the long-lived processing at a lower TPL than the interrupt in order to allow for other activity to be interleaved and to have the richer service set of a lower TPL, as shown in Table 23 "TPL Restrictions" of the UEFI 2.5 specification.
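The top half / bottom half split can be modeled with a small deferred-call queue. This is a toy, single-threaded sketch: the queue, the device structure, and the function names are hypothetical and only loosely follow the queue-then-dispatch shape of a DPC service.

```c
#include <stddef.h>

#define TPL_CALLBACK 8
#define MAX_DPC      16

typedef void (*dpc_proc)(void *ctx);
typedef struct { unsigned tpl; dpc_proc proc; void *ctx; } dpc_entry;

static dpc_entry g_queue[MAX_DPC];
static size_t    g_count;

/* Called from the ISR: remember the work, do not run it yet. */
int dpc_queue(unsigned tpl, dpc_proc proc, void *ctx)
{
    if (g_count == MAX_DPC || proc == NULL)
        return -1;
    g_queue[g_count++] = (dpc_entry){ tpl, proc, ctx };
    return 0;
}

/* Called later at a lower TPL: run queued work, highest TPL first,
 * FIFO among entries of equal TPL. Returns the number of calls made. */
size_t dpc_dispatch(void)
{
    size_t ran = 0;
    while (g_count > 0) {
        size_t best = 0;
        for (size_t i = 1; i < g_count; i++)
            if (g_queue[i].tpl > g_queue[best].tpl)
                best = i;
        dpc_entry e = g_queue[best];
        for (size_t i = best; i + 1 < g_count; i++)
            g_queue[i] = g_queue[i + 1];
        g_count--;
        e.proc(e.ctx);
        ran++;
    }
    return ran;
}

/* Hypothetical device with a pending interrupt and some work to do. */
typedef struct { int irq_pending; int bytes_processed; } device;

/* Bottom half: the long-lived processing, run at TPL_CALLBACK. */
void device_bottom_half(void *ctx)
{
    device *d = ctx;
    d->bytes_processed += 64;
}

/* Top half: quiesce the device and defer everything else. */
void device_isr(device *d)
{
    d->irq_pending = 0;
    dpc_queue(TPL_CALLBACK, device_bottom_half, d);
}
```

Nothing expensive happens inside device_isr; the processing only runs once some lower-TPL context calls dpc_dispatch.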

In fact the EDK II implementation has a DPC implementation https://github.com/tianocore/edk2-MdeModulePkg/blob/master/Include/Protocol/Dpc.h. This is a useful API that may one day go to the UEFI specification, just like the useful PI interfaces of the LockBox https://github.com/tianocore/edk2-MdeModulePkg/blob/master/Include/Protocol/LockBox.h and Variable Lock Protocol https://github.com/tianocore/edk2/blob/master/MdeModulePkg/Include/Protocol/VariableLock.h would help interoperability by joining a future PI specification.

So we see that the plumbing is in place to have hardware device interrupts, but the reason that a UEFI specification conformant driver cannot depend upon hardware interrupts, beyond the implicit timer tick for the timed event services, is that the DEVICE TPL mapping above is not codified by the UEFI specification. Each vendor may provide different TPL mappings for the range between NOTIFY and HIGH_LEVEL. In addition, there is no API like IoConnectInterrupt which abstracts the use of TPL's and managing of the IO interrupt controller from a given UEFI implementation to a portable UEFI driver.

This doesn't stop a vendor who provides the full UEFI implementation, such as an EDK II-based DXE core and platform drivers, from providing this capability and a custom interrupt protocol, of course. You could also retrofit an existing polled UEFI driver to be modal, namely have the driver entry point look for the platform interrupt protocol and register an ISR if it exists, and if it doesn't, default to the existing polled behavior that is most broadly compatible.
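The modal driver idea can be sketched as follows. The platform interrupt protocol here is entirely hypothetical (no such protocol is defined by the UEFI specification, which is the point of the preceding paragraphs), and the stub registration functions stand in for whatever a vendor might provide.

```c
#include <stddef.h>

typedef void (*isr_fn)(void *ctx);

/* Hypothetical vendor protocol; returns 0 when the ISR is registered. */
typedef struct {
    int (*register_isr)(unsigned vector, isr_fn isr, void *ctx);
} platform_interrupt_protocol;

typedef enum { MODE_POLLED, MODE_INTERRUPT } driver_mode;
typedef struct { driver_mode mode; unsigned vector; } driver_state;

static void driver_isr(void *ctx) { (void)ctx; /* quiesce device, defer work */ }

/*
 * Entry point: 'irq' is NULL when the platform does not publish the
 * protocol. Start from the broadly compatible polled mode and only
 * switch when registration actually succeeds.
 */
driver_mode driver_entry(driver_state *s, const platform_interrupt_protocol *irq)
{
    s->mode = MODE_POLLED;
    if (irq != NULL && irq->register_isr != NULL &&
        irq->register_isr(s->vector, driver_isr, s) == 0)
        s->mode = MODE_INTERRUPT;
    return s->mode;
}

/* Stubs standing in for a vendor's implementation. */
int stub_register_ok(unsigned v, isr_fn f, void *c)
{ (void)v; (void)f; (void)c; return 0; }
int stub_register_fail(unsigned v, isr_fn f, void *c)
{ (void)v; (void)f; (void)c; return -1; }
```

Falling back to polling on any failure keeps the same driver binary usable on platforms that never heard of the vendor protocol.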

And given that EFI has been shipping for the last 15 years with the present polled driver model, there is some question as to whether hardware device interrupts are necessary. For boot scenarios where the pre-OS is typically doing a single activity, such as accessing an I/O device, the polled model suffices. When there are performance concerns, the system designer can vary the system timer tick, sometimes going as low as 1ms for the timer period in order to service the device actions. Today, performance sensitive drivers like the Pxe basecode driver aggressively poll the underlying network API's in order to maintain line-rate.

The only cracks in the armor for this model appear when different I/O stacks interact. For example, if we are performing a network download into a memory buffer, only the networking stack is in play. But if the networking download interleaves packet transition with writes to a durable storage media, then the networking stack and file system storage stack compete for resources. This is an area where the polled model can observe system performance challenges.

As we enhance the scenarios with the recent UEFI HTTP API's and other capabilities, it will be fun to watch this space.

Even OpenFirmware 1275 didn't drive their network stack with the Forth FCode but instead it used native code NanoKernel http://www.physik.uni-regensburg.de/strongnet/documents/STRONGnet2010/schick1.pdf

Libraries
Speaking of spaces to watch, another area that interests me is portable libraries. Specifically, the MdePkg of EDK II has different library classes, including 'base'. The nice thing about a library of type base is that I can use the code in PEI pre-memory, PEI post-memory, DXE, DXE SMM, UEFI boot service, UEFI Runtime, and SEC. You can see from the last sentence that firmware programming in the UEFI PI world is pretty challenging. There are seven regimes to write code, and business and technical reasons sometimes dictate moving code from one area to the other, such as re-using the UEFI FAT driver to create a PI recovery PEIM to load a recovery FV from disk, or moving SI code from DXE to PEI for purposes of creating an Intel (R) Firmware Support Package (FSP) binary http://firmware.intel.com/sites/default/files/resources/A_Tour_Beyond_BIOS_Using_the_Intel_Firmware_Support_Package_Version_1_1_with_the_EFI_Developer_Kit_II.pdf.

Beyond the UEFI PI world, imagine having a routine that does error handling, such as the Reliability, Availability, and Serviceability flows that a server designer might want to migrate from SMM http://firmware.intel.com/sites/default/files/resources/A_Tour_beyond_BIOS_Implementing_APEI_with_UEFI_White_Paper.pdf to a system service processor/baseboard management controller (BMC). If these error management flows had their core logic implemented as libraries, then the movement from the host processor to the non-host processor environment would be much easier.

The business value is the fundamental logic in the C code, not the syntactic sugar around the code to make it a SMM driver or a service processor task in the RTOS/process in the OS.

TXT and UEFI Secure Boot & Measured Boot
We just left off talking about the adjacent technologies of UEFI PI and service processors. Another adjacent and quite complementary technology includes UEFI PI and Trusted Computing, including Intel(R) Trusted Execution Technology (TXT). I sometimes get asked about this, so I thought that I'd spend a couple of moments on this topic. I gave a quick overview of UEFI Secure Boot and Measured Boot using a Trusted Platform Module (TPM) on open hardware at http://firmware.intel.com/blog/security-technologies-and-minnowboard-max, but I omitted TXT since this open platform's Intel Baytrail CPU doesn't support those extensions. For hardware that does support TXT, such as Xeon class CPU's and client VPro, though, the relationship bears mentioning.

If you do the reference chasing on the latter link, though, you'll see that the UEFI Secure Boot and the Static Root of Trust for Measurement (SRTM) are a blended scenario, with the latter using the non-resettable, static PCR's 0..7 for the pre-OS, and PCR's 8..15 for the OS. This scenario can co-exist with TXT, such as in slide 10 https://01.org/sites/default/files/openstacksummit_vancouver_trusteddockercontainers.pdf where the "BIOS" here is the early PI code that loads SMM, and the latter BIOS with option ROM's falls under the purview of UEFI Secure Boot. T-Boot is a type of Measured Launch Environment (MLE) https://www.kernel.org/doc/Documentation/intel_txt.txt and using SENTER instruction will activate measurement into the resettable PCR's above PCR15. As such, it provides a Dynamic Root of Trust for Measurement (DRTM) alongside the SRTM. And for purposes of attestation, having more platform elements in the attestation vector provides a richer management experience.

In fact, Bill and I described blending of these various technologies on server class systems 2.5 years ago http://firmware.intel.com/sites/default/files/resources/Platform_Security_Review_Intel_Cisco_White_Paper.pdf. There is also a TCG-defined API to abstract the DRTM launch, as described in http://www.trustedcomputinggroup.org/resources/drtm_architecture_specification, although few systems publish this interface today as far as I know.

In memory
Enough on GUIDs, libraries, and interrupts. I'd like to close this blog with a more personal thought. I had written a small message about my friend George Cox last June http://vzimmer.blogspot.com/2014_06_01_archive.html upon hearing about his retirement. Fast forward a year and I was saddened to see the message of his passing at https://twitter.com/fortnow/status/602872182579015681

‏@fortnow

George Cox, Intel Security Architect, passed away yesterday. 

George was a great friend, technologist, and mentor. The picture below shows George as I remember him best, teaching a technical concept and interacting with others.

Good-bye friend. You'll be missed.

Vincent