Sunday, June 30, 2024

500k

500k.  An interesting milestone. This figure comes from the Springer-Verlag site https://link.springer.com/book/10.1007/978-1-4842-0070-4. I was asked by a colleague how many of the free Kindle copies have been downloaded from Amazon https://www.amazon.com/Embedded-Firmware-Solutions-Development-Practices-ebook/dp/B01JC1LDTY and I didn't have any idea.  Probably a multiple of this number given the paucity of free books in this category?

Either way, the milestone generates a few thoughts. One is a reminder writing technical books isn't about generating large incomes from their sales. A recent Hacker news thread https://news.ycombinator.com/item?id=40830332 and its associated article https://architectelevator.com/strategy/book-author-economics/ are a reminder of this.

"My motivation for writing the book was never the money, and I've generally treated the royalties as a nice bonus. I started writing because I cared a lot about the technology, and I wanted to share it with other people. Writing the book was my way of contributing something to a community that I'd benefited from a lot in my career." 

Another memory includes a dual perspective to the 'open platform blog' http://vzimmer.blogspot.com/2023/05/open-platforms-snapshot-may-2023.html, namely the binary dimension. If you recall from that posting I cite the open source presentation that included the line "Minimize IP components in binary like Intel FSP." So the FSP evolution was always the binary portion of having the open platform code based full solution. A rough roadmap of this work leading up to 2022 can be found in https://link.springer.com/chapter/10.1007/978-1-4842-7939-7_5


As a refresher, in 2014 we were faced with how to support platform code of both coreboot https://www.intel.com/content/www/us/en/developer/articles/tool/coreboot.html and EDKII-ilk https://www.intel.com/content/www/us/en/developer/articles/tool/unified-extensible-firmware-interface.html. The proposal of the multi-division working group I started then included the approach show in https://www.intel.com/content/dam/develop/external/us/en/documents/sf14-stts001-820295.pdf.


The IOT division (now NEX) was already leaning in to using FSP but they mixed the SOC specific details and the API. One of the first things we did was to split out the interface from the SOC-specific implementation. This led to the series of FSP External Architecture Specifications (EAS) found at https://github.com/intel/fsp/wiki and the 'integration guides' found on https://github.com/intel/fsp, such as https://github.com/intel/FSP/blob/master/RaptorLakeFspBinPkg/Docs/AlderLake_FSP_Integration_Guide.pdf

As part of the journey of making community based development less difficult, I was able to clean up the license of the FSP from a 10-page click-through to a simple one based upon the microcode license https://www.phoronix.com/news/Intel-Better-FSP-License https://mail.coreboot.org/pipermail/coreboot/2018-August/087220.html

With FSP2.0 we introduced the FSP-T, FSP-M, and FSP-S to support the non-memory mapped boot map of Apollo Lake (a topology described in https://cdrdv2.intel.com/v1/dl/getContent/671281), and 2.1 introduced dispatch mode for easier integration in a native EDKII environment. The original way to interface with the Intel FSP used by coreboot and slim bootloader is called API mode.

All along the way the FSP's themselves were based upon a mixture of closed source EDKII style silicon code and open source EDKII infrastructure, as exemplified by the https://github.com/tianocore/edk2/tree/master/IntelFsp2Pkg.

So you will see that the timeline above from the 2022 book stops with FSP2.3.  Since then we dropped the FSP 2.4 specification. 2.4 was a pretty radical change to FSP that added things like 64-bit support, SMM encapsulation, cooperative state storage, and additional multi-phase. These FSP changes were part of the broader Universal Scalable Firmware (USF) effort https://universalscalablefirmware.github.io/documentation/8_scalable_fsp.html#sfsp-interactions.  

USF https://www.intel.com/content/www/us/en/developer/articles/technical/universal-scalable-firmware.html was for a while called 'SubZero' to compose as part of the larger oneAPI effort publicly discussed by Raja at https://www.anandtech.com/show/15990/hot-chips-2020-live-blog-intels-raja-koduri-keynote-200pm-pt 


(BTW - this hierarchy also explains the challenges in writing a firmware technical book)


Idea was to have a 'sub zero' or 'level -1' as distinct form the level 0 device driver work of oneAPI https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.7vlscg

The USF stack entailed breaking up the specific concerns of SOC, platform, and boot technology, as shown in figure 


https://universalscalablefirmware.github.io/documentation/1_terminology.html. And unlike the 2014 IDF presentation that just showed FSP supporting coreboot and EDKII, USF vied to support additional platform code technologies, such as https://github.com/slimbootloader/slimbootloader and even the pure-Rust based https://github.com/oreboot/oreboot, at least until the latter removed their FSP support https://github.com/oreboot/oreboot/tree/remove-vendorcode-fsp in order to keep the project based purely on open sources.

This narrative isn't just my perspective. J. Zhang from Meta had written the following 


 in https://link.springer.com/book/10.1007/978-1-4842-7939-7


It's interesting that parties outside of my company use 'OSF' (i.e., Open Source Firmware) acronym a lot that I'm sometimes surprised in that I rarely if ever hear the term within the corporate walls. 

To me the important part of doing USF was the openness, including POC's and specification drafts at https://github.com/universalscalablefirmware. For example, we fabricated the FSP 2.4 changes for 64-bit at https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64, YAML-based configuration (versus bespoke BSF) https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_yaml, SMM encapsulation in FSP https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64_smm (originally inspired by https://www.intel.com/content/www/us/en/content-details/671459/a-tour-beyond-launching-standalone-smm-drivers-in-the-pei-phase-using-the-edk-ii.html), and a 'bootable FSP' or FSP@Reset or 'FSP-R' https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_at_reset.

During the early days of FSP we didn't just document the platform code usage to 'consume' the FSP but we also explicated how to 'produce' and FSP (i.e., the type of recipe used in the FSP QEMU instances above) https://www.intel.com/content/www/us/en/content-details/671448/tour-beyond-bios-creating-the-intel-firmware-support-package-version-1-1-with-the-edk-ii.html 

And now in 2024 the bottom 3 of the collaborators are at Microsoft. Quite the change over time.

Additional information on USF can be found at https://github.com/UniversalScalableFirmware/Introduction/blob/main/USF_Overview.pdfhttps://www.osfc.io/2021/talks/an-evolutionary-approach-to-system-firmware/, and https://uefi.org/sites/default/files/resources/USF_Security_Webinar_Final.pdf.

We even described about how to have shareable C code, the predominate language of EDKII, coreboot, and slim bootloader, with Rust https://universalscalablefirmware.github.io/documentation/3_platform_orchestration_layer.html#shareable-platform-code-rust-binding-api

Speaking of Rust, the recently published https://techcommunity.microsoft.com/t5/surface-it-pro-blog/surface-uefi-evolution-in-boot-security-amp-device-management-to/ba-p/4159998 on MS  Rust support generated a few questions to me recently. I’m a fan of moving firmware into Rust in addition to other defense in depth (isolation, ISA mitigations, etc). We did an initial integration of Rust into EDKII https://github.com/tianocore/edk2-staging/tree/edkii-rust 5 years ago described in https://uefi.org/sites/default/files/resources/Enabling%20RUST%20for%20UEFI%20Firmware_8.19.2020.pdf  and https://cfp.osfc.io/media/osfc2020/submissions/SLFJTN/resources/OSFC2020_Rust_EFI_Yao_Zimmer_NDK4Dme.pdf.  We also provided guidance on Rust for firmware in one of our book chapters https://link.springer.com/chapter/10.1007/978-1-4842-6106-4_20

There is also the camp of using 'modern C++' as another memory safe language like Rust https://www.cisa.gov/news-events/news/urgent-need-memory-safety-software-products for systems programming. I'm open to smart pointers and other idioms of those applied to firmware, but the same issue of the 'unsafe UEFI protocols' with their raw pointers will have the safety scoped to only the interior of PEIMs, DXE drivers, UEFI drivers, and UEFI applications, respectively. 

The tianocore community ended up not pushing the Rust work into EDKII upstream for various reasons (people/value/feedback), including no one wanting to invest in the EDKII build system and drive an integration like this. Later work with Google Summer of Code yielded getting the UEFI Rust Crate up streamed https://crates.io/crates/uefi. This allows for building stand-alone .efi images with this crate and including the resultant binary into EDKII full firmware integration.  This latter approach allows community to leverage the goodness of the Rust ecosystem that is vibrant/supported/growing – Cargo, libraries of crates, auto test and doc generation, etc – and avoid some of the vagaries of the EDKII native build system.

In addition to the API changes, the provenance of firmware was a design point. As such, we created the https://www.intel.com/content/www/us/en/content-details/644001/content-details.html specification to describe how how to create manifests and measurements for the FSP and do the corresponding work for the Universal Payload (UPL) https://universalscalablefirmware.github.io/documentation/5_security.html#universal-payload-measurement. UPL is another aspect of the USF work that provides interoperability between how to boot, whether a UEFI style boot with the EDKII payload package, LinuxBoot, or an embedded hypervisor or RTOS. This type of layering for a very diffuse supply chain is akin to attempts like https://android-developers.googleblog.com/2017/05/here-comes-treble-modular-base-for.html. Just as the Android userland should be platform independent, there is a similar demarcation in UEFI where the bulk of the DXE drivers for UEFI compatibility is platform independent https://github.com/tianocore/edk2/tree/master/UefiPayloadPkg, with the same argument holding for a more generic Linux kernel for LinuxBoot https://www.linuxboot.org/.

Speaking of FSP 2.4, in postings the 64-bit work gets a call-out from Google in https://www.phoronix.com/news/Chrome-64-bit-Firmware-Adapt https://blog.osfw.foundation/chrome-ap-firmware-adopting-to-x86_64-architecture/. It still feels like yesterday when I coded up the first PEI code https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Core/DxeIplPeim code to transition to a 64-bit DXE from a 32-bit PEIM 20 years ago. Given our small amount of cache-as-RAM at the time it seemed otherworldly to imagine moving both PEI and DXE to 64-bit at that time, so we opted for the 32-bit PEIM and 64-bit DXE we have had up to today. I also recall looking at the sample code of the AMD64 data book at the time to inspire some of this machine transition code creation. 

Although most of the posted FSP's are client and microserver at https://github.com/intel/fsp

big core Xeon is joining the list. 

Specifically the use of FSP for Xeon gets mention in https://www.phoronix.com/news/Bytedance-CloudFW-Open-Source https://bytedance.larkoffice.com/file/boxcnIHvljaKfN2EaEr0H2ZMzyg and has made progress with https://github.com/intel/FSP/tree/master/EagleStreamFspBinPkg and associated open source platform code at https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp, including the Eagle Stream mentioned above https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/spr and the upcoming GNR https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/gnr. The spr coreboot workflow has a nice overview at https://www.intel.com/content/www/us/en/content-details/778593/coreboot-practice-on-eagle-stream.html, too.

AMD has been working on open sourcing coreboot code for their Epyc servers, with https://github.com/coreboot/coreboot/tree/main/src/soc/amd/genoa_poc that leverages the https://github.com/openSIL/openSIL libraries (sort of like an open source variant of Intel's FSP-S code) and binaries posted https://github.com/amd/firmware_binaries/tree/main/genoa/PSP

Open source platform code is interesting. It may offer sustainability options, such as creating your own firmware for a decommissioned server board, or one for which ownership has been transferred. The concept of ownership transfer can be found in work at the OCP https://www.opencompute.org/documents/ibm-white-paper-ownership-and-control-of-firmware-in-open-compute-project-devices. This type of sentiment of part of the circular economy thinking.

Speaking of servers, I joined Intel to lead the 64-bit Merced firmware. We launched EFI on that platform but built upon SAL and PC/AT BIOS. Afterward when Tiano and the Framework-based EDK code was developed for a full platform initialization, I was asked to lead getting the first IA32 Xeon product to adopt the technology. It was the Blackford https://ark.intel.com/content/www/us/en/ark/products/27746/intel-5000p-memory-controller.html chipset-based platform. There was immense push-back from the internal teams to EDK and EFI in general. Originally we thought servers would embrace EFI for use-cases like provisioning, etc., but it turned out servers were the most conservative product category at often last to change.

If you made it this far I apologize. This is the type of blog you get when I camel up a lot of thoughts and don't commit to a final draft, I suppose, for some months. And to continue the meandering, one other sentiment that the above history of crafting firmware specifications reminds me of is how informal, semi-formal, and formal techniques can be applied to this domain going forward. I was reminded of this imperative by the quotation:

"If you’re a software engineer, especially one working on large-scale systems, distributed systems, or critical low-level system, and are not using formal methods as part of your approach, you’re probably wasting time and money. Because, ultimately, engineering is an exercise in optimizing for time and money1." https://brooker.co.za/blog/2024/04/17/formal.html  

I often tell folks that engineers are like applied economists.  Sufficient outcome for the lowest cost. This a another trope along with my 'business/team/career' hierarchy of importance I often quote.

And speaking of another Seattle data point beyond Amazon's Brooker quotation above, I am sad to see that the computer history museum I mentioned 6 years ago http://vzimmer.blogspot.com/2018/06/ is going away https://www.geekwire.com/2024/seattles-living-computers-museum-logs-off-for-good-as-paul-allen-estate-will-auction-vintage-items/. UW hosted an event at the museum after Allen's donation ended up renaming the school in his name. Sadly he passed away a few months later. With the following COVID and settling of his estate, it appears that the museum is a victim of the times.

On a brighter note, I was happy to see another local, Microsoft's Dave Thaler https://www.microsoft.com/en-us/research/people/dthaler/

appear in the eBPF https://ebpf.io/ documentary https://www.youtube.com/watch?v=Wb_vD3XZYOA. I worked with Dave in the late 2000's on evolving UEFI network boot to IPV6 https://www.rfc-editor.org/rfc/rfc5970.txt https://www.ietf.org/archive/id/draft-zimmer-dhc-dhcpv6-remote-boot-options-01.txt.  He looks largely the same as when we were drafting the RFC in his MSFT office or co-presenting at some IETF session. I wish I could say the same about myself. And of course the other notable figure from that documentary who now works at Intel and with whom I had the chance to collaborate  https://patents.google.com/patent/US20240143341A1/en

is the compute performance guru Brendan Gregg https://www.brendangregg.com/. Given his office in Australia I am dubious about f2f co-work opportunities, though, as I had with Dave.

Well, enough for June. Here's looking forward to some thoughts in the upcoming months.

PS
I still need to reconcile my usage of other sites versus blogger. I snapped a couple of conversations since I think the free/community version of Slack removes content after some time window (90 days?).

Specifically, here are some responses I posted on the OSFC slack channel in response to queries, viz.,

https://app.slack.com/client/T0RASQBGW/C9ZLS0U4F


I can understand your confusion.  The answer is mostly #3.Per your question - the typical model is for a hardware root of trust (Intel BtG, AMD PSP, etc) to verify the firmware volume w/ SEC+ PEI code, or "Initial Boot Block" (IBB) via a hash comparison.  Then the IBB code has a library to do verification of the OBB via another hash comparison via code like https://github.com/tianocore/edk2/blob/master/SecurityPkg/FvReportPei/FvReportPei.c.  The OBB is another firmware volume.  The OBB contains DXE and the UEFI Secure boot logic.  The code in the OBB then validates 3rd party UEFI drivers in option ROMs and UEFI images on disk or network via assymetric crypto verification of the Authenticode-based signed PE's.  You can see all of this put together in https://tianocore-docs.github.io/Understanding_UEFI_Secure_Boot_Chain/draft/secure_boot_chain_in_uefi/boot_chain__putting_it_all_together.htmlPS
The UEFI Spec and its 'Secure boot' (really a mistake made by some folks marketing windows.  The 'secure boot' section was about network auth protocol and the pe/coff signing really didn't get read in until https://uefi.org/specs/UEFI/2.10/32_Secure_Boot_and_Driver_Signing.html#uefi-driver-signing-overview).  In general it was a booboo to even call 32.1 'secure', but that's a sin of decades past now.Also, I originally hoped to do per PEIM and per DXE validation as noted 20 years ago in https://www.researchgate.net/publication/377810413_TechnologyIntel_Magazine_-_Advances_[…]atform_Firmware_Beyond_BIOS_and_Across_all_Intel_R_Silicon with sentence "The Framework and EFI drivers may optionally be
cryptographically validated before use to ensure that a chain of trust exists from power-on until the OS boots and
beyond."  Framework https://www.intel.fr/content/dam/doc/product-specification/efi-driver-execution-interface-dxe-cis-specification.pdf was the name of PI specs before they were donated/std'ized in UEFI Forum as the Platform Initialization (PI) specs.  The thinking was PEIM and DXE binaries could be sourced from different vendors, whereas today most people build their PEI and DXE from source.  It's the UEFI drivers and Apps that are ingested as 3rd party binaries given the different between OEM's (PI code), IHV's (adapter card UEFI drivers), and OSV's (OS loaders) in the supply chain.


https://app.slack.com/client/T0RASQBGW/C0RAR7JRM



The UEFI PI spec defines a dependency expression (depex) https://uefi.org/specs/PI/1.8A/V2_DXE_Dispatcher.html#dependency-expressions section in the firmware file or a PEIM or DXE driver that has an RPN encoding of the ppi or protocol consumed by a module.  The PEI and DXE cores use the depex to see if the required PPI's or Protocols have been published prior to dispatching a PEIM or DXE driver.That's the standards side.  On the code side, the EDKII implementation .inf consumes and produces are not used to generate the dependency expression. The .inf file for a given module has the expression under the '[Depex]' portion of the file https://tianocore-docs.github.io/edk2-InfSpecification/draft/2_inf_overview/215_[depex]_section.html#215-depex-section.  These are manually created since the developer can conditionally depend upon other ppis/protocols (imagine control flow based upon some platform state such as a GPIO asserted that tells code whether or not to invoke some 'recovery' PPI/protocol).  That's why you see things like "SOMETIMES_CONSUMES" in files like https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf

No comments: