Tuesday, May 3, 2022

Re-use - ideas, et al.

This is a quick note on 're-use'.  Recall how UEFI was derived from EFI, which in turn started as IBI, as mentioned in in http://vzimmer.blogspot.com/2018/10/ghosts-of.html and various histories of EFI in books http://vzimmer.blogspot.com/2021/01/books-and-computers.html and papers https://www.intel.com/content/dam/www/public/us/en/documents/research/2011-vol15-iss-1-intel-technology-journal.pdf

As a background, the original Intel Boot Initiative had its own filesystem called "IBIFS". It was a simple log-structured filesystem that made for easy parsing. But the challenge with this filesystem was how to provision boot media from other operating systems, such as Windows. In the 1990's MS-FAT was the prevailing filesystem although it was much more complex than IBIFS. The IBI/EFI team was encourage to adopt MS-FAT https://download.microsoft.com/download/1/6/1/161ba512-40e2-4cc9-843a-923143f3456c/fatgen103.doc, along with PE/COFF https://docs.microsoft.com/en-us/windows/win32/debug/pe-format executables, as the filesystem and executable format, respectively. This was supported by having a license to use these specifications for production of EFI firmware. 

The other advantage of using MS filesystem included MS providing tools to format, check and provision disks https://www.intel.com/content/www/us/en/download/714351/uefi-shell-disk-utilities.html whose sources were derived from MS Installable Filesystem (IFS) C++ kit provided to OEM's. This was my first dive into C++-on-a-C-framework (i.e., no global constructors, etc).

And PE/COFF allowed for using Visual Studio and commercial linkers. At the time we had less luck in getting PDB format and KD wire protocol made public. We wouldn't open source the EFI Debug support protocol and nub based upon reversing that infrastructure, of course. 

Of course using ANSI C with PE/COFF was more advantageous than using non-standard art like the coreboot romcc that proved brittle to maintain long-term, although it was a clever solution prior to the advent of cache-as-ram (CAR) https://www.coreboot.org/data/yhlu/cache_as_ram_lb_09142006.pdf. Prior to CAR we had to perform unnatural acts using MMX registers and other on-processor resources (with the peak weirdness of PEI CIS 0.3 and its register 'call levels') in order to implement pre-permanent memory flows like DRAM initialization in the host firmware. 

I was reminded of this logic of re-use based upon a few recent topics, including USF https://github.com/universalscalablefirmware and EDK2 GSoC conversation. On the USF topic, there is a discussion of using a self-describing format like CBOR https://www.rfc-editor.org/rfc/rfc8949.html in lieu of the UEFI HOB's https://universalscalablefirmware.groups.io/g/discussion/files/New%20data%20format%20for%20Universal%20Payload%20Interface.pdf. Even though I helped create HOB's, I am one of the first to support moving to a more standard format, especially given the existence of work like https://github.com/intel/tinycbor. To me this is the same spirit of using known filesystem formats, etc.

When asked about this I often cite the economics of host firmware development. Imagine a 5,000 person host firmware producing community across open source participants, OEM's, ODM's, firmware vendors, first party device creators, etc.  Recall the supply chain picture from erstwhile 2015 CanSecWest prezo



As such, compare this to the application and OS producing community that is easily orders of magnitude larger. Given the developer community size disparity it is difficult to justify the R&D and validation expense for bespoke host firmware-only solutions. Given NIH sometimes in engineering communities I also quote "Good artists invent, great artists steal" Picasso  https://medium.com/ben-shoemate/what-does-it-mean-good-artists-copy-great-artists-steal-ee8fd85317a0.

And given challenges of HOB's I understand the critique of Terse Executable (TE) during gsoc thread "[edk2-discuss] GSoC Proposala replacement for the TE format (it’s buggy and most platforms mostly abandoned it over various issues)," I'm not sure where he received his telemetry on 'abandoned' for working systems. Scraping FD images from the internet? Similar to some of the weird history on those GSoC threads like "PEI and DXE in 1998" when the latter were created in early 2000's as derivatives of EFI into the earlier boot flow. This TE assertion is close to home given the TE image format signature has 'vz' as opposed to PE/COFF 'mz', as originally described in https://www.amazon.com/Beyond-Bios-Implementing-Extensible-Interface/dp/0974364908



So has the TE image been relevant in the industry? Let's do some rough back-of-the-envelope calculations. To begin, the TE image format is easily over 20 years old. In the spirit of the last blog on statistics I'd sat we started to scale Framework EDKII in 2005


Imagine in the ensuing years it's tough to calculate the number of PC's, servers, and clients that shipped using UEFI PI-based firmware, including #'s like https://www.statista.com/statistics/576151/unit-shipments-pcs-united-states/. For Windows8 UEFI was required so you can argue 100% at that point. 

To take a swag lets assume 250 million units / year over the last 17 years, yielding 4.25 billion units. Let's also assume the fabrication of the PEI phase pre-memory uses TE's and that each fv/ibb/fvrecovery has ~ 50 peim's compiled with TE. So that would yield the market has shipped 50 * 4.25  =  212.5 billion TE images into the market. And if platform updated 5 times during life, 1 trillion TE's may have been produced. So even if we go 'Beyond TE', 1T has been a good run. And I look forward to the next wave of firmware technologists' proposals - the critique is a pre-condition to invention/innovation - but the next wave needs to close the loop by proposing, documenting, coding, delivering, validating, and scaling their alternative. 

I still recall meeting one of the original developers of MS-FAT in Windows at a driver dev-con in the early 2000's in Redmond. He mentioned that 'writing the code was only 10% of the effort.' The latter class of activities were the remaining 90% for a professional software engineer. Or the old trope 'the last 10% of a project takes 90% of the time.'

Finally, these cardinality of the produced TE image set calculations remind me of a colleague at Cornell during my undergrad. He worked part-time in the astrophysics department, whereas I spent time modeling and mocking up particle beams for a professor who did microwave and plasma research as part of SDI https://en.wikipedia.org/wiki/Strategic_Defense_Initiative. My friend noted that the astrophysics researchers there would reckon calculations plus or minus a couple orders of magnitude. I won't argue host firmware is of the same scale but it does note the challenges in precision in various domains.

Sometimes folks believe in bespoke and not studying the past arc of history in tech. I recommend spending some time on bitsavers.org, for example, to folks. But it reminds me of the advice given to me by a retired manager - "Sometimes you see someone put their hand in the chipper. You can say 'there's a chipper, there's a chipper', but they ignore you. Once they are beyond their elbow you need to jump in, at least." Or the dual of child rearing, which was "Sometimes the young child needs to bump her knees on the coffee table while learning to walk."  For those not in the PNW a chipper is https://www.familyhandyman.com/list/best-wood-chippers/.

BTW
I just sampled some of the OCP Security https://www.opencompute.org/wiki/Security tech talks. Work to avoid e-waste such as ownership transfer https://docs.google.com/document/d/1oANhjvv_R7E5n8w1RroN8l8-0jdYlfdQDp_3RqGV66k/edit# for servers presented by Google datacenter engineer nicely complements their work on Chromebooks with their developer mode https://www.chromium.org/chromium-os/chromiumos-design-docs/developer-mode/ and other art to allow for unsupported devices to be over-flashed by community firmware for continued life & usage. Nice to see work to avoid aged hardware from only have a fate of visiting the refuse bin/salvage yard.

It might be a stretch to analogize re-use of ideas in this blog post and re-use of hardware in the circular economy from this OCP talk. The former stands on the shoulders of giants and leverages information ecosystems (knowledge, tools, remediated flaws) whereas the latter leverages extant hardware for appropriate use-cases. But interesting stuff nevertheless.

fin (98)



No comments: