Sunday, June 30, 2024

500k

500k.  An interesting milestone. This figure comes from the Springer-Verlag site https://link.springer.com/book/10.1007/978-1-4842-0070-4. I was asked by a colleague how many of the free Kindle copies have been downloaded from Amazon https://www.amazon.com/Embedded-Firmware-Solutions-Development-Practices-ebook/dp/B01JC1LDTY and I didn't have any idea.  Probably a multiple of this number given the paucity of free books in this category?

Either way, the milestone generates a few thoughts. One is that writing technical books isn't about generating large incomes from their sales. A recent Hacker News thread https://news.ycombinator.com/item?id=40830332 and its associated article https://architectelevator.com/strategy/book-author-economics/ are a reminder of this.

"My motivation for writing the book was never the money, and I've generally treated the royalties as a nice bonus. I started writing because I cared a lot about the technology, and I wanted to share it with other people. Writing the book was my way of contributing something to a community that I'd benefited from a lot in my career." 

Another memory includes a dual perspective to the 'open platform blog' http://vzimmer.blogspot.com/2023/05/open-platforms-snapshot-may-2023.html, namely the binary dimension. If you recall from that posting, I cited the open source presentation that included the line "Minimize IP components in binary like Intel FSP." So the FSP evolution was always the binary portion of a full solution based on open platform code. A rough roadmap of this work leading up to 2022 can be found in https://link.springer.com/chapter/10.1007/978-1-4842-7939-7_5


As a refresher, in 2014 we were faced with how to support platform code of both coreboot https://www.intel.com/content/www/us/en/developer/articles/tool/coreboot.html and EDKII-ilk https://www.intel.com/content/www/us/en/developer/articles/tool/unified-extensible-firmware-interface.html. The proposal of the multi-division working group I started then included the approach shown in https://www.intel.com/content/dam/develop/external/us/en/documents/sf14-stts001-820295.pdf.


The IOT division (now NEX) was already leaning in to using FSP but they mixed the SOC specific details and the API. One of the first things we did was to split out the interface from the SOC-specific implementation. This led to the series of FSP External Architecture Specifications (EAS) found at https://github.com/intel/fsp/wiki and the 'integration guides' found on https://github.com/intel/fsp, such as https://github.com/intel/FSP/blob/master/RaptorLakeFspBinPkg/Docs/AlderLake_FSP_Integration_Guide.pdf

As part of the journey of making community based development less difficult, I was able to clean up the license of the FSP from a 10-page click-through to a simple one based upon the microcode license https://www.phoronix.com/news/Intel-Better-FSP-License https://mail.coreboot.org/pipermail/coreboot/2018-August/087220.html

With FSP2.0 we introduced the FSP-T, FSP-M, and FSP-S to support the non-memory mapped boot map of Apollo Lake (a topology described in https://cdrdv2.intel.com/v1/dl/getContent/671281), and 2.1 introduced dispatch mode for easier integration in a native EDKII environment. The original way to interface with the Intel FSP used by coreboot and slim bootloader is called API mode.
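For readers who have not integrated an FSP in API mode, here is a minimal sketch of the call ordering a bootloader follows across the three components. The entry-point names (TempRamInit, FspMemoryInit, TempRamExit, FspSiliconInit, NotifyPhase) come from the FSP 2.0 API; the `ApiModeBootloader` class and its checks are hypothetical and only model the sequence. The sketch also assumes the bootloader uses all three components, including FSP-T.

```python
# Illustrative model only: entry-point names follow the FSP 2.0 API, but the
# class and its ordering checks are hypothetical, not Intel or coreboot code.
FSP_API_ORDER = [
    ("FSP-T", "TempRamInit"),     # set up temporary memory (cache-as-RAM)
    ("FSP-M", "FspMemoryInit"),   # initialize permanent memory
    ("FSP-M", "TempRamExit"),     # tear down temporary memory
    ("FSP-S", "FspSiliconInit"),  # initialize the remaining silicon
    ("FSP-S", "NotifyPhase"),     # signal boot milestones to the FSP
]


class ApiModeBootloader:
    """Tracks which FSP entry points a bootloader has called so far."""

    def __init__(self):
        self.next_index = 0
        self.log = []

    def call(self, api_name):
        last = len(FSP_API_ORDER) - 1
        if self.next_index > last:
            # NotifyPhase is called once per milestone, so repeats are allowed.
            if api_name != "NotifyPhase":
                raise RuntimeError(f"{api_name} called after the final phase")
            self.log.append(FSP_API_ORDER[last])
            return
        component, expected = FSP_API_ORDER[self.next_index]
        if api_name != expected:
            raise RuntimeError(f"{api_name} called before {expected}")
        self.log.append((component, api_name))
        self.next_index += 1


if __name__ == "__main__":
    bootloader = ApiModeBootloader()
    for name in ["TempRamInit", "FspMemoryInit", "TempRamExit",
                 "FspSiliconInit", "NotifyPhase", "NotifyPhase"]:
        bootloader.call(name)
    for component, name in bootloader.log:
        print(component, name)
```

In dispatch mode the bootloader does not drive this sequence itself; the FSP content is dispatched as modules in a native EDKII environment.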

All along the way the FSP's themselves were based upon a mixture of closed source EDKII style silicon code and open source EDKII infrastructure, as exemplified by the https://github.com/tianocore/edk2/tree/master/IntelFsp2Pkg.

So you will see that the timeline above from the 2022 book stops with FSP2.3.  Since then we dropped the FSP 2.4 specification. 2.4 was a pretty radical change to FSP that added things like 64-bit support, SMM encapsulation, cooperative state storage, and additional multi-phase. These FSP changes were part of the broader Universal Scalable Firmware (USF) effort https://universalscalablefirmware.github.io/documentation/8_scalable_fsp.html#sfsp-interactions.  

USF https://www.intel.com/content/www/us/en/developer/articles/technical/universal-scalable-firmware.html was for a while called 'SubZero' to compose as part of the larger oneAPI effort publicly discussed by Raja at https://www.anandtech.com/show/15990/hot-chips-2020-live-blog-intels-raja-koduri-keynote-200pm-pt 


(BTW - this hierarchy also explains the challenges in writing a firmware technical book)


The idea was to have a 'sub zero' or 'level -1' as distinct from the level 0 device driver work of oneAPI https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.7vlscg

The USF stack entailed breaking up the specific concerns of SOC, platform, and boot technology, as shown in the figure from https://universalscalablefirmware.github.io/documentation/1_terminology.html. And unlike the 2014 IDF presentation that just showed FSP supporting coreboot and EDKII, USF vied to support additional platform code technologies, such as https://github.com/slimbootloader/slimbootloader and even the pure-Rust based https://github.com/oreboot/oreboot, at least until the latter removed their FSP support https://github.com/oreboot/oreboot/tree/remove-vendorcode-fsp in order to keep the project based purely on open sources.

This narrative isn't just my perspective. J. Zhang from Meta had written the following 


 in https://link.springer.com/book/10.1007/978-1-4842-7939-7


It's interesting that parties outside of my company use the 'OSF' (i.e., Open Source Firmware) acronym a lot. I'm sometimes surprised by this, since I rarely if ever hear the term within the corporate walls.

To me the important part of doing USF was the openness, including POC's and specification drafts at https://github.com/universalscalablefirmware. For example, we fabricated the FSP 2.4 changes for 64-bit at https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64, YAML-based configuration (versus bespoke BSF) https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_yaml, SMM encapsulation in FSP https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64_smm (originally inspired by https://www.intel.com/content/www/us/en/content-details/671459/a-tour-beyond-launching-standalone-smm-drivers-in-the-pei-phase-using-the-edk-ii.html), and a 'bootable FSP' or FSP@Reset or 'FSP-R' https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_at_reset.

During the early days of FSP we didn't just document the platform code usage to 'consume' the FSP but we also explicated how to 'produce' an FSP (i.e., the type of recipe used in the FSP QEMU instances above) https://www.intel.com/content/www/us/en/content-details/671448/tour-beyond-bios-creating-the-intel-firmware-support-package-version-1-1-with-the-edk-ii.html

And now in 2024 the bottom 3 of the collaborators are at Microsoft. Quite the change over time.

Additional information on USF can be found at https://github.com/UniversalScalableFirmware/Introduction/blob/main/USF_Overview.pdf, https://www.osfc.io/2021/talks/an-evolutionary-approach-to-system-firmware/, and https://uefi.org/sites/default/files/resources/USF_Security_Webinar_Final.pdf.

We even described how to share C code, the predominant language of EDKII, coreboot, and slim bootloader, with Rust https://universalscalablefirmware.github.io/documentation/3_platform_orchestration_layer.html#shareable-platform-code-rust-binding-api

Speaking of Rust, the recently published https://techcommunity.microsoft.com/t5/surface-it-pro-blog/surface-uefi-evolution-in-boot-security-amp-device-management-to/ba-p/4159998 on MS  Rust support generated a few questions to me recently. I’m a fan of moving firmware into Rust in addition to other defense in depth (isolation, ISA mitigations, etc). We did an initial integration of Rust into EDKII https://github.com/tianocore/edk2-staging/tree/edkii-rust 5 years ago described in https://uefi.org/sites/default/files/resources/Enabling%20RUST%20for%20UEFI%20Firmware_8.19.2020.pdf  and https://cfp.osfc.io/media/osfc2020/submissions/SLFJTN/resources/OSFC2020_Rust_EFI_Yao_Zimmer_NDK4Dme.pdf.  We also provided guidance on Rust for firmware in one of our book chapters https://link.springer.com/chapter/10.1007/978-1-4842-6106-4_20

There is also the camp of using 'modern C++' as another memory safe language like Rust https://www.cisa.gov/news-events/news/urgent-need-memory-safety-software-products for systems programming. I'm open to smart pointers and other idioms of those applied to firmware, but the same issue of the 'unsafe UEFI protocols' with their raw pointers will have the safety scoped to only the interior of PEIMs, DXE drivers, UEFI drivers, and UEFI applications, respectively. 

The tianocore community ended up not pushing the Rust work into EDKII upstream for various reasons (people/value/feedback), including no one wanting to invest in the EDKII build system and drive an integration like this. Later work with Google Summer of Code got the UEFI Rust crate upstreamed https://crates.io/crates/uefi. This allows for building stand-alone .efi images with this crate and including the resultant binary in a full EDKII firmware integration.  This latter approach allows the community to leverage the goodness of the Rust ecosystem that is vibrant/supported/growing – Cargo, libraries of crates, auto test and doc generation, etc – and avoid some of the vagaries of the EDKII native build system.

In addition to the API changes, the provenance of firmware was a design point. As such, we created the https://www.intel.com/content/www/us/en/content-details/644001/content-details.html specification to describe how to create manifests and measurements for the FSP and do the corresponding work for the Universal Payload (UPL) https://universalscalablefirmware.github.io/documentation/5_security.html#universal-payload-measurement. UPL is another aspect of the USF work that provides interoperability between how to boot, whether a UEFI style boot with the EDKII payload package, LinuxBoot, or an embedded hypervisor or RTOS. This type of layering for a very diffuse supply chain is akin to attempts like https://android-developers.googleblog.com/2017/05/here-comes-treble-modular-base-for.html. Just as the Android userland should be platform independent, there is a similar demarcation in UEFI where the bulk of the DXE drivers for UEFI compatibility is platform independent https://github.com/tianocore/edk2/tree/master/UefiPayloadPkg, with the same argument holding for a more generic Linux kernel for LinuxBoot https://www.linuxboot.org/.

Speaking of FSP 2.4, in postings the 64-bit work gets a call-out from Google in https://www.phoronix.com/news/Chrome-64-bit-Firmware-Adapt https://blog.osfw.foundation/chrome-ap-firmware-adopting-to-x86_64-architecture/. It still feels like yesterday when I coded up the first PEI code https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Core/DxeIplPeim to transition to a 64-bit DXE from a 32-bit PEIM 20 years ago. Given our small amount of cache-as-RAM at the time, it seemed otherworldly to imagine moving both PEI and DXE to 64-bit, so we opted for the 32-bit PEIM and 64-bit DXE we have had up to today. I also recall looking at the sample code of the AMD64 data book at the time to inspire some of this machine transition code creation.

Although most of the posted FSP's at https://github.com/intel/fsp are client and microserver, big core Xeon is joining the list.

Specifically the use of FSP for Xeon gets mention in https://www.phoronix.com/news/Bytedance-CloudFW-Open-Source https://bytedance.larkoffice.com/file/boxcnIHvljaKfN2EaEr0H2ZMzyg and has made progress with https://github.com/intel/FSP/tree/master/EagleStreamFspBinPkg and associated open source platform code at https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp, including the Eagle Stream mentioned above https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/spr and the upcoming GNR https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/gnr. The spr coreboot workflow has a nice overview at https://www.intel.com/content/www/us/en/content-details/778593/coreboot-practice-on-eagle-stream.html, too.

AMD has been working on open sourcing coreboot code for their Epyc servers, with https://github.com/coreboot/coreboot/tree/main/src/soc/amd/genoa_poc that leverages the https://github.com/openSIL/openSIL libraries (sort of like an open source variant of Intel's FSP-S code) and binaries posted https://github.com/amd/firmware_binaries/tree/main/genoa/PSP

Open source platform code is interesting. It may offer sustainability options, such as creating your own firmware for a decommissioned server board, or one for which ownership has been transferred. The concept of ownership transfer can be found in work at the OCP https://www.opencompute.org/documents/ibm-white-paper-ownership-and-control-of-firmware-in-open-compute-project-devices. This type of sentiment is part of the circular economy thinking.

Speaking of servers, I joined Intel to lead the 64-bit Merced firmware. We launched EFI on that platform but built upon SAL and PC/AT BIOS. Afterward when Tiano and the Framework-based EDK code was developed for a full platform initialization, I was asked to lead getting the first IA32 Xeon product to adopt the technology. It was the Blackford https://ark.intel.com/content/www/us/en/ark/products/27746/intel-5000p-memory-controller.html chipset-based platform. There was immense push-back from the internal teams to EDK and EFI in general. Originally we thought servers would embrace EFI for use-cases like provisioning, etc., but it turned out servers were the most conservative product category and often the last to change.

If you made it this far, I apologize. This is the type of blog you get when I camel up a lot of thoughts and don't commit to a final draft, I suppose, for some months. And to continue the meandering, one other sentiment that the above history of crafting firmware specifications reminds me of is how informal, semi-formal, and formal techniques can be applied to this domain going forward. I was reminded of this imperative by the quotation:

"If you’re a software engineer, especially one working on large-scale systems, distributed systems, or critical low-level system, and are not using formal methods as part of your approach, you’re probably wasting time and money. Because, ultimately, engineering is an exercise in optimizing for time and money." https://brooker.co.za/blog/2024/04/17/formal.html

I often tell folks that engineers are like applied economists.  Sufficient outcome for the lowest cost. This is another trope, along with my 'business/team/career' hierarchy of importance, that I often quote.

And speaking of another Seattle data point beyond Amazon's Brooker quotation above, I am sad to see that the computer history museum I mentioned 6 years ago http://vzimmer.blogspot.com/2018/06/ is going away https://www.geekwire.com/2024/seattles-living-computers-museum-logs-off-for-good-as-paul-allen-estate-will-auction-vintage-items/. UW hosted an event at the museum after Allen's donation ended up renaming the school in his name. Sadly he passed away a few months later. With the following COVID and settling of his estate, it appears that the museum is a victim of the times.

On a brighter note, I was happy to see another local, Microsoft's Dave Thaler https://www.microsoft.com/en-us/research/people/dthaler/

appear in the eBPF https://ebpf.io/ documentary https://www.youtube.com/watch?v=Wb_vD3XZYOA. I worked with Dave in the late 2000's on evolving UEFI network boot to IPv6 https://www.rfc-editor.org/rfc/rfc5970.txt https://www.ietf.org/archive/id/draft-zimmer-dhc-dhcpv6-remote-boot-options-01.txt.  He looks largely the same as when we were drafting the RFC in his MSFT office or co-presenting at some IETF session. I wish I could say the same about myself. And of course the other notable figure from that documentary who now works at Intel and with whom I had the chance to collaborate  https://patents.google.com/patent/US20240143341A1/en

is the compute performance guru Brendan Gregg https://www.brendangregg.com/. Given his office in Australia I am dubious about f2f co-work opportunities, though, as I had with Dave.

Well, enough for June. Here's looking forward to some thoughts in the upcoming months.

PS
I still need to reconcile my usage of other sites versus blogger. I snapped a couple of conversations since I think the free/community version of Slack removes content after some time window (90 days?).

Specifically, here are some responses I posted on the OSFC slack channel in response to queries, viz.,

https://app.slack.com/client/T0RASQBGW/C9ZLS0U4F


I can understand your confusion.  The answer is mostly #3. Per your question - the typical model is for a hardware root of trust (Intel BtG, AMD PSP, etc) to verify the firmware volume w/ SEC + PEI code, or "Initial Boot Block" (IBB), via a hash comparison.  Then the IBB code has a library to do verification of the OBB via another hash comparison via code like https://github.com/tianocore/edk2/blob/master/SecurityPkg/FvReportPei/FvReportPei.c.  The OBB is another firmware volume.  The OBB contains DXE and the UEFI Secure boot logic.  The code in the OBB then validates 3rd party UEFI drivers in option ROMs and UEFI images on disk or network via asymmetric crypto verification of the Authenticode-based signed PE's.  You can see all of this put together in https://tianocore-docs.github.io/Understanding_UEFI_Secure_Boot_Chain/draft/secure_boot_chain_in_uefi/boot_chain__putting_it_all_together.html

PS
The UEFI Spec and its 'Secure boot' (really a mistake made by some folks marketing Windows.  The 'secure boot' section was about network auth protocol and the pe/coff signing really didn't get read in until https://uefi.org/specs/UEFI/2.10/32_Secure_Boot_and_Driver_Signing.html#uefi-driver-signing-overview).  In general it was a booboo to even call 32.1 'secure', but that's a sin of decades past now. Also, I originally hoped to do per PEIM and per DXE validation as noted 20 years ago in https://www.researchgate.net/publication/377810413_TechnologyIntel_Magazine_-_Advances_[…]atform_Firmware_Beyond_BIOS_and_Across_all_Intel_R_Silicon with sentence "The Framework and EFI drivers may optionally be cryptographically validated before use to ensure that a chain of trust exists from power-on until the OS boots and beyond."  Framework https://www.intel.fr/content/dam/doc/product-specification/efi-driver-execution-interface-dxe-cis-specification.pdf was the name of PI specs before they were donated/std'ized in UEFI Forum as the Platform Initialization (PI) specs.  The thinking was PEIM and DXE binaries could be sourced from different vendors, whereas today most people build their PEI and DXE from source.  It's the UEFI drivers and Apps that are ingested as 3rd party binaries given the differences between OEM's (PI code), IHV's (adapter card UEFI drivers), and OSV's (OS loaders) in the supply chain.
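The first two links of the chain described in the Slack response above (root of trust checks the IBB, IBB checks the OBB) can be sketched in a few lines. This is only a toy model, assuming SHA-256 and in-memory byte strings; the function names are hypothetical and it does not reflect the actual FvReportPei code. The third link, Authenticode signature verification of third-party images, is not modeled.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Digest helper; SHA-256 is an assumption made for this sketch."""
    return hashlib.sha256(data).digest()


def verify_stage(name: str, image: bytes, expected_digest: bytes) -> None:
    """Refuse to continue if the image does not match its recorded hash."""
    if sha256(image) != expected_digest:
        raise ValueError(f"{name} failed hash verification")


def boot(ibb: bytes, obb: bytes, rot_ibb_digest: bytes, ibb_obb_digest: bytes):
    """Model the hash chain: hardware root of trust -> IBB -> OBB.

    rot_ibb_digest stands in for the hash held by the hardware root of trust;
    ibb_obb_digest stands in for the hash the IBB carries for the OBB.
    """
    verified = []
    verify_stage("IBB", ibb, rot_ibb_digest)   # done by the root of trust
    verified.append("IBB")
    verify_stage("OBB", obb, ibb_obb_digest)   # done by code inside the IBB
    verified.append("OBB")
    return verified


if __name__ == "__main__":
    ibb_image = b"SEC + PEI firmware volume"
    obb_image = b"DXE firmware volume"
    print(boot(ibb_image, obb_image, sha256(ibb_image), sha256(obb_image)))
```

Each stage only needs the expected hash of the next one, which is why a single hardware-anchored hash of the IBB is enough to root the whole chain.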


https://app.slack.com/client/T0RASQBGW/C0RAR7JRM



The UEFI PI spec defines a dependency expression (depex) https://uefi.org/specs/PI/1.8A/V2_DXE_Dispatcher.html#dependency-expressions section in the firmware file of a PEIM or DXE driver that has an RPN encoding of the PPIs or protocols consumed by a module.  The PEI and DXE cores use the depex to see if the required PPI's or Protocols have been published prior to dispatching a PEIM or DXE driver. That's the standards side.  On the code side, the EDKII implementation .inf consumes and produces are not used to generate the dependency expression. The .inf file for a given module has the expression under the '[Depex]' portion of the file https://tianocore-docs.github.io/edk2-InfSpecification/draft/2_inf_overview/215_[depex]_section.html#215-depex-section.  These are manually created since the developer can conditionally depend upon other ppis/protocols (imagine control flow based upon some platform state such as a GPIO asserted that tells code whether or not to invoke some 'recovery' PPI/protocol).  That's why you see things like "SOMETIMES_CONSUMES" in files like https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf
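To make the RPN point concrete, here is a small sketch of how a dispatcher might evaluate a depex against the set of PPIs or protocols published so far. The opcode names (PUSH, AND, OR, NOT, TRUE, FALSE, END) are a subset of those in the PI spec; the token representation, the function name, and the GUID names are hypothetical and chosen for readability rather than matching the binary encoding or the EDKII core code.

```python
def evaluate_depex(expression, published):
    """Return True if the postfix (RPN) dependency expression is satisfied.

    expression: list of tokens, either ("PUSH", guid) or a bare opcode string.
    published:  set of GUID names already installed by earlier modules.
    """
    stack = []
    for token in expression:
        opcode, operand = token if isinstance(token, tuple) else (token, None)
        if opcode == "PUSH":
            stack.append(operand in published)
        elif opcode == "TRUE":
            stack.append(True)
        elif opcode == "FALSE":
            stack.append(False)
        elif opcode == "NOT":
            stack.append(not stack.pop())
        elif opcode in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if opcode == "AND" else (left or right))
        elif opcode == "END":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    raise ValueError("expression is missing END")


if __name__ == "__main__":
    # Hypothetical GUID names; equivalent to "gExampleAGuid AND gExampleBGuid".
    depex = [("PUSH", "gExampleAGuid"), ("PUSH", "gExampleBGuid"), "AND", "END"]
    print(evaluate_depex(depex, {"gExampleAGuid"}))
    print(evaluate_depex(depex, {"gExampleAGuid", "gExampleBGuid"}))
```

A dispatcher would re-run this check for each undispatched module whenever a new PPI or protocol is installed, dispatching the module once its expression evaluates to true.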

Saturday, March 30, 2024

A legend passes

Sad to see the news about Ross Anderson https://en.wikipedia.org/wiki/Ross_J._Anderson passing

https://alecmuffett.com/article/109513

https://news.ycombinator.com/item?id=39864210

https://twitter.com/duncan_2qq/status/1773752269395099774 


Like many I was inspired and informed by his various editions of the "Security Engineering" book https://www.cl.cam.ac.uk/~rja14/book.html. I also explored the domain via papers like https://www.cl.cam.ac.uk/~fms27/papers/1999-StajanoAnd-duckling.pdf that I referenced in https://www.researchgate.net/publication/221199899_Platform_Trust_Beyond_BIOS_Using_the_Unified_Extensible_Firmware_Interface/references. I also cull wisdom from papers like https://www.cl.cam.ac.uk/~rja14/Papers/satan.pdf since having worked on the boundary of software and hardware for so long, sometimes errant hardware or firmware is truly an embodiment of 'Satan's Computer.'

My small interaction with Prof Anderson was during the writing of https://link.springer.com/book/10.1007/978-1-4842-6106-4

 


 My co-author and I reached out to see if Anderson would write a foreword, with the below response


Luckily we did get a very insightful write-up


from Leendert Van Doorn https://blog.paramecium.org/about/.

This was an ironic pairing in retrospect, seeing Anderson's critiques of Trusted Computing and Leendert's contemporary contributions to that domain, respectively. Having these titans both critique https://www.cl.cam.ac.uk/~rja14/tcpa-faq-1.0.html and build a domain https://www.amazon.com/Practical-Guide-Trusted-Computing/dp/0132398427, like TCPA (now TCG https://en.wikipedia.org/wiki/Trusted_Computing_Group) Trusted Platform Modules, represents a healthy aspect of technology evolution in my view. Differing views make any technology stronger, versus groupthink & homogeneity of thought.

 Sad times for the security community, though, with the loss of a legend.





Sunday, March 24, 2024

Sneers, CNAs, licenses, and fuzzing

Let's start off with something I occasionally see in industry, namely 'the grand sneer' mentioned in https://buttondown.email/hillelwayne/archive/know-of-the-right-tool-for-the-job/. I find the 'sneering' is often a sign of youth or narrow experience or not exploring outside of your domain or https://twitter.com/vincentzimmer/status/1762972464169296002...

The more you know often leads to greater humility borne of realizing how much knowledge there is in the world that you don't know.

Another interesting posting of late was the fact that the Linux kernel is now a CNA https://amanitasecurity.com/posts/dear-linux-kernel-cna-what-have-you-done/ https://news.ycombinator.com/item?id=39627302. I noted that there are similar challenges in other open source infrastructure like https://github.com/tianocore/tianocore.github.io/wiki/Reporting-Security-Issues in https://twitter.com/vincentzimmer/status/1768351312205484380

Another posting in that thread clicked into the SBOM topic with an advocacy for the VEX format. Some work in this space can be found in https://github.com/hughsie/uefi-sbom-best-practices/blob/main/index.rst, too.

So a lot of these thoughts are borne of experience. Amazon has a famous quote that goes something like "there is no compression algorithm for experience," but I'd have to say things are getting pretty good with LLM's. In fact I am glad that my longer form works were published prior to ChatGPT.  Maybe the world of text will be bifurcated into BG and PG - "Before GPT" and "Post GPT."

I don't subscribe to the dystopian 'paperclip' https://cepr.org/voxeu/columns/ai-and-paperclip-problem style apocalypse of AI but I do admire the foundations upon which these large foundation models are built, namely the sum of human knowledge, or the internet. From the hockey-stick style growth of the net in '97 from the Metacrawler era http://vzimmer.blogspot.com/2021/01/memories-from-uw-and-cornell.html to today's corpus of information on the web, it's truly staggering.

Some examples of oopsies around folks leveraging chatGPT a little too much include https://www.sciencedirect.com/science/article/abs/pii/S2468023024002402 https://simonwillison.net/2024/Mar/15/certainly-here-is-google-scholar/ and https://news.ycombinator.com/item?id=39733605.

Speaking of experience, Subrata made a nice posting https://twitter.com/abarjodi/status/1771948383529247011



namely the "FSP Customization - Remove non-mandatory components in the Intel FSP" for the Open Source Firmware Foundation (OSFF) Byte talks - volume 1, March 8, 2024 https://opensourcefirmware.foundation/events/bytetalks-vol.-1/. The video is now posted at https://www.youtube.com/watch?v=0ciYjPSu56A. This builds on work trying to help the various communities https://www.phoronix.com/news/Google-Intel-More-FSP-Flexible

 https://blog.osfw.foundation/breaking-the-boundary-a-way-to-create-your-own-fsp-binary/. In the past, we responded to the concerns about FSP licensing described in https://www.phoronix.com/news/Intel-Better-FSP-License 

https://mail.coreboot.org/pipermail/coreboot/2018-August/087220.html 


It's hard to 'sneer' when the community is seeing problem statements not necessarily experienced in your own environment or workflow.

Sometimes folks don't sneer but ignore. For example, the use of SIMICS https://github.com/intel/tsffs for fuzzing firmware mentioned in https://twitter.com/jerry_Intel/status/1762220373503005056 regrettably didn't cite https://ieeexplore.ieee.org/document/9218694 in their blog https://community.intel.com/t5/Blogs/Products-and-Solutions/Security/Chips-Salsa-This-Hardware-Does-Not-Exist/post/1572067. I ordinarily wouldn't call folks out if it weren't for the fact that in an internal presentation of their work I mentioned the preceding development on UEFI SIMICS fuzzing and the ensuing paper to the TSFFS folks, with a response from the TSFFS lead that "Oh yes, we leveraged that work.  We were disappointed that you published first so that we couldn't." So at least not a sneer :)

On a more positive note, the team did some great evolution, including extending 'beyond BIOS' use-case, getting it open source, and finally, against many odds within large companies enamored of Python et al these days, evolving the feature to use the Rust language. 

And additional props go out to my former software division that delivered TSFFS to the open source for their work in evolving HBFA https://github.com/tianocore/tianocore.github.io/wiki/Host-Based-Firmware-Analyzer with their https://github.com/intel/HBFA-FL project. They did a nice job on ack'ing the earlier work, too https://www.intel.com/content/dam/develop/external/us/en/documents/intel-usinghbfatoimproveplatformresiliency-820238.pdf



Although a lot of the constituent elements like https://github.com/S2E are in the open, I wasn't able to get the symbolic execution work described in https://www.usenix.org/conference/woot15/workshop-program/presentation/bazhaniuk across the open source finish line. The lure of retirement, Amazon, and Eclypsium ended up disbanding that team over time and no new team emerged from the ashes to carry it forward. 



Saturday, February 24, 2024

27 or Anniversary.Next^12, AI, Runtime

Anniversary

True to form, today is my work anniversary. I started at Intel on February 24, 1997. This post also builds on my last posting in this vein http://vzimmer.blogspot.com/2023/02/26-or-anniversarynext11-and-wisdom-of.html. At this point I have spent more than half of my life on this planet at this single company.

Since the last posting I've been back to the office daily, sometimes visiting the Crossroads for lunch, where I had my first sighting of a Cyber Truck.



I'd often work through lunch and eat from the local self-serve facilities but the pickings have been slim. So slim that even a 2-month-expired tuna sandwich was eaten by someone.




 I last took a sabbatical in 2011 where I smashed up my arm on the 2nd day and ended up w/ 2 surgeries. I still recall the one-handed typing away at the ITJ articles https://www.intel.com/content/dam/www/public/us/en/documents/research/2011-vol15-iss-1-intel-technology-journal.pdf that appeared later that year.  Typically sabbaticals expire but with the strangeness of COVID there were extensions that turned into a revised evergreen policy where sabbaticals no longer expire.  Regrettably, though, one stops accruing time after reaching 16 years. Looks like I need to pull the trigger on at least one month in the upcoming year, viz.,

 

Beyond cyber trucks, old sandwiches, and sabbaticals, the last year has seen a lot of energy around AI. Even the University of Washington lecture series has been dominated by this topic. An interesting talk from Meta was included in https://www.ece.uw.edu/news-events/lytle-lecture-series/ with slides https://www.ece.uw.edu/wp-content/uploads/2024/01/lecun-20240124-uw-lyttle.pdf and a recording https://www.youtube.com/watch?v=d_bdU3LsLzE. You can catch me at 6:36 on LHS of the screen


It's always interesting to be there in person. One comment that intrigued me was from Yann LeCun about the Metaverse build out and GPU procurement. I cannot recall if it was on-tape or off-the-record so I'll avoid going further than that.



This talk was at the Lyceum HUB but all of the other talks were across the street.



Another interesting talk https://www.cs.washington.edu/events/colloquia/details?id=3310 was from Fei-Fei Li of Stanford and ImageNet https://ieeexplore.ieee.org/document/5206848 fame.
This was near the comp sci building

LeCun's and Li's talks were like Taylor-Swift-concert-crowds but for techies. Nice to see these thought leaders share their insights and experience in person. 

Google Waymo was also in the mix with a talk https://tcat.cs.washington.edu/events/taskar-center-memorial-distinguished-lecture/ on the challenges of autonomous driving and AI https://www.youtube.com/watch?v=pK5ChzMsfE0


So let's pivot from AI lectures to a topic closer to home, namely firmware runtime. I chose this topic since one of the most popular posts in this blog series was on UEFI variable runtime http://vzimmer.blogspot.com/2012/12/accessing-uefi-form-operating-system.html. I suspect this stems from the fact that most folks interact with the platform at runtime over its life. In our quest for 0-second boot this makes even more sense.

Regarding how firmware is exposed to the operating system, and thus ultimately the user, at runtime, there are various modalities. 

These include:

  • Static info tables
    • Advanced Configuration and Power Interface (ACPI)
    • Flattened Device Tree (FDT)
  • Interpreter bytecode
    • ACPI AML interpreter
    • UEFI EBC sandbox
    • x86 VDM for video int10h calls
  • Native code runtime
    • UEFI runtime
    • Power9 OPAL
    • Platform Runtime Mechanism (PRM)
  • Opaque host modes - synchronous and asynchronous activations
    • SMI (SMM) - x64
    • SMC (TrustZone) - ARM
    • Trap (Machine Mode) - RISC-V

These are all for the host firmware, or firmware running on the main application processor/core.

There is also device firmware running in the SOC.

In addition, at the platform level, the host can signal 'non-host' such as an Embedded Controller (EC) on client and Baseboard Management Controller (BMC) on server.

Below shows some of the latter 

 






 from https://link.springer.com/book/10.1007/978-1-4842-7939-7.


A lot of the flows are blended. For example, the use of ACPI and SMI for errors is described in https://cdrdv2.intel.com/v1/dl/getContent/671067, but the paper is light on ACPI.

Regrettably the collection at https://www.amazon.com/stores/Vincent-Zimmer/author/B002I6IW4A is pretty sparse on ACPI, too, although https://link.springer.com/book/10.1007/978-1-4842-7974-8 covers construction and design in the case studies.

A curation of ACPI material can be found in the following:


Public documents on ACPI

2022

Intro to ACPI from ACPI spec

https://uefi.org/specs/ACPI/6.5/01_Introduction.html


2019 

ACPI tutorial

https://acpica.org/sites/acpica/files/asl_tutorial_v20190625.pdf 


2013

ACPI and UEFI

https://cdrdv2.intel.com/v1/dl/getContent/671067 


ACPI overview https://www.intel.com/content/dam/www/public/us/en/documents/research/2009-vol13-iss-1-intel-technology-journal.pdf


2009

ACPI and SMM

https://link.springer.com/article/10.1007/s11416-009-0138-0 


2006

ACPI attacks

https://www.blackhat.com/presentations/bh-europe-06/bh-eu-06-Heasman.pdf 


2004

ACPI HOWTO

https://tldp.org/HOWTO/pdf/ACPI-HOWTO.pdf


2003

IA64 book, including UEFI and ACPI chapter

https://www.amazon.com/Itanium-Architecture-Programmers-Understanding-Processors/dp/0131013726

 

2001

ACPI Book

https://dl.acm.org/doi/book/10.5555/940719 


1997 

ACPI implementation guide

https://www.baldwin.cx/~phoenix/reference/docs/acpi_impguide.pdf 


1996 for first spec - full history in 
https://en.wikipedia.org/wiki/ACPI 

 
A colleague suggested that I should have spent more time in the books writing about ACPI since he spends his time debugging issues on Windows and can quote many instances of poor or misunderstood ACPI constructions. I also chronicled in the past my exchange https://vzimmer.blogspot.com/2018/ with the Linux kernel leadership on the topic. 

A lot of people grouse about UEFI and ACPI when in fact it isn't the standard they are critiquing but instead the implementation.  Namely, it's often a confusion between interface and implementation. 

For example, on topics like the firmware support package (FSP), I hear complaints. I often offer the following three-way split there:

    Consumer (caller)

    Interface (specification)

    Producer (implementation)
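The split can be sketched in a few lines of Python. To be clear, the SiliconInit interface and both producers below are made up for illustration and are not the real FSP API; the only point is that two producers can sit behind one specification, and a consumer's bad experience may come from the producer rather than the interface.

```python
from typing import Protocol


class SiliconInit(Protocol):
    """Interface (specification): a made-up stand-in, not the real FSP API."""

    def memory_init(self, dimm_count: int) -> int:
        """Return 0 on success, non-zero on error."""
        ...


class CarefulProducer:
    """Producer (implementation) that honors every legal input."""

    def memory_init(self, dimm_count: int) -> int:
        return 0 if dimm_count > 0 else 1


class SloppyProducer:
    """Producer with the same interface that mishandles a legal input."""

    def memory_init(self, dimm_count: int) -> int:
        return 0 if dimm_count == 2 else 1


def boot(silicon: SiliconInit, dimm_count: int) -> str:
    """Consumer (caller): written only against the interface."""
    return "booted" if silicon.memory_init(dimm_count) == 0 else "halted"


if __name__ == "__main__":
    print(boot(CarefulProducer(), 4))
    print(boot(SloppyProducer(), 4))
```

The consumer code is identical in both calls, so a failure with the second producer says nothing about whether the interface itself is sound.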


You can find examples of using FSP (consumer/caller) with EDKII 

Sometimes people grouse about FSP when in fact it's the implementation, not the API. And it has been a long run on FSP. The Intel IOTG folks kicked it off in 2010 and then Jiming and I conspired to have a working group that I've led/co-led since 2014 on the same with various other co-leads and collaborators along the way. Some of the outputs of that collaboration, recording details from Maurice, Ravi and Jiming, can be found in the '15 book https://link.springer.com/book/10.1007/978-1-4842-0070-4, too. Maurice is now doing great things in fw at MS and Jiming at AMZN, resp., AFAIK. One of my key contributions in those 2013/2014 days was teasing out the FSP spec, which was originally an amalgam of silicon details and APIs, into a couplet of docs, namely the main interface spec, the FSP EAS, and the respective SOC integration guides, such as can be found today in https://www.intel.com/fsp and in https://www.github.com/intel/FSP, respectively. This allowed for creating class drivers for FSP in the various platform code (e.g., coreboot and EDKII at the time) and abstracting SOC specifics. And I shouldn't forget how the decade+ of cross-group collaboration allowed for scaling FSPs from embedded to both mainstream client and servers, as demonstrated by the rich postings on Github. It's not a perfect split between EAS and integration guide, though, but as we've struggled with the 2001 Intel Framework APIs and 2006+ UEFI PI spec interfaces, building that 'firmware socket' set of abstractions is tough given the variability of silicon and products over time.

Same with UEFI.  The interface is at https://uefi.org/specifications. Two implementations can be found at https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Core/Dxe and https://github.com/u-boot/u-boot/tree/master/lib/efi, respectively. Most folks complain about the EDKII implementation when they invoke 'UEFI is broken', I feel. This reminds me of the sentiment that some person with the handle of 'the_panopticon' mentioned in https://news.ycombinator.com/item?id=39481434

So back to ACPI. The evolution of ACPI in the mid-90's predates late 90's EFI. EFI started as a sample and then the Tiano project with EDK and then EDKII subsumed Framework/PI and EFI/UEFI but not the ACPI specification. As such, there was no modularization of ACPI from the beginning because of Conway's Law (i.e., the folks who owned the UEFI spec and its reference implementation didn't own the ACPI spec for the first nearly 2 decades). In fact a lot of the ACPI table construction on the Intel platforms was derived from the DaVinci/Kittyhawk clean-room C BIOS that pre-dated Tiano in that hotbed of late 90's BIOS innovation called the DuPont, WA Intel site. This model of static tables differs from the ARM ecosystem, which does dynamic table generation since they started much later on the journey and had a single-team/Linaro view of entering standards-based firmware.

An advantage of dynamic table generation is flexibility, but a downside is that it is harder to do attestation since the table fields are not known at build time. Even build time calculations, though, are aggravated by patching. So in practice it is not really done, just like a lot of configuration and PCR[1] content not really being reconstructed for some attestation / verifier flows.
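A toy Python sketch of why patching hurts here: a TPM-style extend folds the hash of each measured blob into a running register, so a verifier can only reproduce the final value if it knows every byte that was measured. The event payloads and table contents below are invented for the demo, and which PCR a given platform actually uses for table or configuration measurements is not something this sketch asserts.

```python
import hashlib

PCR_SIZE = 32  # size of a register in a SHA-256 bank


def extend(pcr: bytes, event_data: bytes) -> bytes:
    """TPM-style extend: new = H(old || H(event_data))."""
    return hashlib.sha256(pcr + hashlib.sha256(event_data).digest()).digest()


def replay(events: list[bytes]) -> bytes:
    """Recompute the register value a verifier would expect from an event list."""
    pcr = bytes(PCR_SIZE)  # registers start at all zeros
    for event in events:
        pcr = extend(pcr, event)
    return pcr


# Hypothetical measured blobs: a table as emitted by the build, and the same
# table after a boot-time patch changed one field.
static_table = b"DEMO" + bytes([4, 0, 0, 0])
patched_table = b"DEMO" + bytes([8, 0, 0, 0])

golden = replay([b"setup-config", static_table])
observed = replay([b"setup-config", patched_table])

if __name__ == "__main__":
    print(golden == observed)  # prints False
```

A one-byte patch yields an unrelated final value, so a verifier holding only the build-time 'golden' number cannot tell a benign patch from tampering unless it also replays the event log with the patched contents.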

Thinking about these taxonomies of firmware, from SAL+BIOS to Kittyhawk to ACSFL to EDK to EDKII to slim bootloader to coreboot to.... I realize that I might have the dubious honor of having worked on the broadest variety of host firmware at my employer.

Time marches on. In mentioning DaVinci/Kittyhawk workstation BIOS, I realize that my colleagues on that late 90's adventure have largely left the company, from retirement to downsizing to becoming execs at other tech shops (e.g., MS). Similar to the thinning of the crowd of others, such as even my 2015 colleagues from https://www.usenix.org/conference/woot15/workshop-program/presentation/bazhaniuk who have all left for startups (e.g., Eclypsium) or retirement or other big tech (e.g., Amazon). Well, given those statistics and chaos in tech, this might be my last Next^* blog. Even if it is, though, I have enjoyed the run and the people I've met along the way. Hopefully I have repaid my employer's and colleagues' trust with sufficient contributions these last 27 years.  

Cheers