Tuesday, September 24, 2024

Reflecting on my time at a tech company (aka 'Retiring from Intel')

I've queued up some blog drafts over the last couple of months but I haven't been able to generate the energy to finish them. They just didn't seem to have enough bulk to them.

So why post now, and with this 'new' content?

Well, I want to share that I have elected to retire from Intel after 27.5 years. My last day will be September 30. While I'm moving on to the next chapter of life, I'll always cherish the time I spent at Intel. 

And in fact it is with no small amount of temerity I write this message, especially after receiving so many soulful and impactful farewell messages recently from Intel colleagues also opting into this retirement package.  I'm somewhat 'late' in penning my message, I'm afraid (at this point in time I haven't sent out the broad bcc'd "I'm leaving" email).  And then there's my all-time favorite parting message I captured from Sham at the end of the https://vzimmer.blogspot.com/2023/05/open-platforms-snapshot-may-2023.html posting that I could never hope to emulate. 

But emulate I won't. In fact, I'll write this as I do most of my postings, sort of a rambling message to myself; on this sentiment I'm apparently not alone, given the quote of another 1.5-decade blogger: "I keep this blog for me to write, not necessarily for others to read." https://www.jonashietala.se/blog/2024/09/25/why_i_still_blog_after_15_years/. For this particular post I couldn't figure out where to insert a 'TL;DR' since I sometimes think that could be the title or theme of this whole blog series :) I only regret that I won't have a reason to author a successor to https://vzimmer.blogspot.com/2024/02/27-or-anniversarynext12-ai-runtime.html.

So for more of the TL ('too long'), rewind the clock 32.5 years to my first five years post-undergrad in industry prior to Intel.  In those early days of 1992 back in Houston I was introduced to BIOS and embedded firmware development using Intel technology, from the i8051 through i80186 … and culminating with the P6. Beyond the data sheets, I also immersed myself deeply in Intel-driven specifications like PCI and I2O (although forgotten by PCI SIG and intel.com, many still live on at https://bitsavers.org/pdf/intel/). These experiences ranged from poring over the public black-cover tomes of data books to the yellow-cover NDA documents, while continually being intrigued by what was happening at Intel via reading reports on the company in print periodicals like EE Times; this was the early 90’s prior to the internet going big.

Who knows? Maybe some of the work I contributed to at Intel, whether papers or books or specifications such as mentioned in https://vzimmer.blogspot.com/2021/01/books-and-computers.html, might end up at bitsavers some day, say Beyond BIOS https://www.amazon.com/Beyond-Bios-Implementing-Extensible-Interface/dp/0974364908 will have a URL like the RMX book http://www.bitsavers.org/pdf/intel/iRMX/iRMX_III/Real-Time_and_Systems_Programming_for_PCs_1993.pdf?

Given my early exposure to Intel, imagine my delight in getting recruited by Intel to lead the development of Itanium firmware for the Merced CPU in late 1996 and joining the Intel High-End Server Division (HESD) in February 1997 in DuPont WA. The Intel recruiter told me that I could ‘go to Hillsboro for Xeon or Dupont for Itanium.’ I wasn’t familiar w/ any place in the PNW so the obvious choice was to join the Intel 64-bit wave! Prior to joining Intel, I still recall my Compaq manager saying when I served notice “I guess you’re going to Portland Oregon” when in fact I was heading to Washington state. Commencing in ’97 I was now part of the mission to help create the technology behind those great products and standards I’d admired so much.  

Since then, I truly realized the saying of Steve Jobs 'The only way to do great work is to love what you do,' and I've truly loved working alongside such talented and dedicated individuals in this work. That was the missing link from my pre-Intel days, namely the broad experience with Intel employees.

Speaking of people and technology and standards, now more than half my life, or these last 27.5 years at Intel, have more than exceeded my hopes, but it’s the people with whom I’ve collaborated, learned, and grown I appreciate the most.  Thank you all for creating a positive and inspiring work environment. From co-creators of the SAL+NuBiOS & SAL+AMI ‘Salami’ firmware for Merced in HESD, to the Workstation Product Group (WPG) Kittyhawk native C code that booted Intel P3 on 840 Rambus and Merced 460GX w/ either the AMI 630 ‘furball’ or the EFI sample as the late-stage payloads. Then off to Microcomputer Software Lab (MSL) in MD6 to work on the hit series of scaling EFI from 0.92 to today’s UEFI 2.11, along with “Tiano” that yielded EDK->EDKII and the Intel Platform Innovation Framework for the Extensible Firmware Interface (i.e., “Framework”) specifications that have become the UEFI Platform Initialization (PI) specifications of today. This latter work spanned orgs from MSL to EPG to SEG to SSG to SATG to DEG to my final home here in CCG. I guess the only platform group I missed was embedded, although I enjoyed collaborating with those folks from ACSFL in the late 90’s to today’s slim bootloader.

I suppose I can date the badges by BDE or ADE ('Before drop-e' or 'After drop-e') https://vzimmer.blogspot.com/2014/01/advances-in-platform-firmware-beyond.html.

It’s open source platform code like slim bootloader, coreboot, and EDKII features/platforms that have occupied the last 10 years of scaling the Firmware Support Package (FSP), ….. along with the primary mission of FSP to have a clear business boundary between Intel-owned and customer code. This last decade also included contributing to NIST 800-193 on platform firmware resiliency and recovery.  And and and ….

...and booting.  Measured boot, UEFI Secure boot, ipv6 boot/netboot6, HTTP boot, boot-from-Wifi.....Sometimes I'd use 'booting from a sneaker' as a variant of the Toaster or Fabrikam sort of pedagogic fake device, but given Bluetooth and smart accessories/shoes I suspect this one will fall into the 'life imitates art.'

And I could take a whole detour on security and friends long past. Someone said I was the final member of the below bench to exit. John of PSIRT, Yuriy of threat research, Kirk of all-things-SMM security, ... Zimmer as the UEFI security guy. I still recall a colleague saying 'bring boxes of the Intel Press Beyond BIOS and Shell books. The visitors will love them.' Given the muscle ache from both lugging them down to Portland and back to Seattle I couldn't help but think of the Harold Ramis quote in Ghostbusters that 'print is dead.'  Even those many years ago no one wanted those bulky dead-tree texts.


Beyond the tech milestones, I still recall a few words of wisdom from a now-retired colleague. One was ‘the best architecture is sometimes knowing what to leave out’ (I heard it but didn’t necessarily always practice it) and the other was ‘I don’t know why people don’t get it, but BIOS can be a great career.’  And a great Intel career it has been. Another was ‘the higher leadership ascends you’ll find the more impactful decisions they have to make with successively less information.’  So my take away is that you should take it easy on the bosses, especially in tough times.

And there is my 3-tuple of advice I sometimes give others and myself:  ‘business first, team second, and career third.’  To me this means focus on the business priorities first, even if they transcend your team’s charter. Next help develop and foster a strong team environment for the mission to collaborate on these business challenges.  And a distant third is your career.  I don’t mean to imply career growth is unimportant but more that if you focus on the business priorities and the team, a well-managed company will acknowledge your efforts.  

Also, observe where the interesting problems are being worked and good team cultures exist. With that insight, take the opportunity to engage with such focus areas and collaborators when it arises; it may help your career long term.  And 'keep learning.' This may sound a bit strange coming from me since a boss recently said ‘...and if you don’t want to keep learning then just “retire”.’  I personally hope to do both, but the exhortation to 'keep learning' is golden irrespective of one's employer or employment state or age or.....

And given this is a wrap-up sort of blog, I've probably repeated a few themes mentioned before. Some are quite important, though, such as 'it's the people that matter.' Projects and tech come and go. The people are the key invariant of value. For example, sometimes folks think I get excited by books and patents, but it's the co-authors and co-inventors that thrill me. I may forget a book chapter or set of independent claims, but I'll never forget the rich set of colleagues with whom I toiled shoulder-to-shoulder on these endeavors. And these endeavors match my triad of biz/team/career in that they were all done to help further a business strategy, secondarily they entailed team collaboration (sometimes co-authors outside of team or company), and at the end of the day, they may have helped (or hindered) my career arc. As long as I hit #1 and #2, though, I'm at peace.

Other wisdom? Don't bash other technology. I still regret writing 

twenty years ago in  https://www.researchgate.net/publication/377810413_TechnologyIntel_Magazine_-_Advances_in_Platform_Firmware_Beyond_BIOS_and_Across_all_Intel_R_Silicon. You win by being good, not by belittling the competition. And the fact that the PC industry for 20+ years had shipped on this 'monolithic', 'space constrained' BIOS rebutted my argument. And to be honest, Tiano in 2004 wasn't the exemplar of software quality and stability.

I find a kindred soul in Prof G's advice that 'work life balance is a myth' https://www.raconteur.net/talent-culture/scott-galloway-work-life-balance-work-from-home but the part I perhaps erred on is ignoring the qualifier 'when you are young.' I have kept this unbalance through 3+ decades :) But it has been a great trip and I can see doing more when there are opportunities to dent some more https://www.goodreads.com/quotes/950437-we-re-here-to-put-a-dent-in-the-universe-otherwise.

I'm not sure what the next phase of the journey will be, but I couldn't help but laugh when reading this cartoon from the New Yorker recently. I sort of put my own spin on it, although some may say it reads well in its original.


And I sure have quite a reading backlog to attack (see background of posts like https://vzimmer.blogspot.com/2021/11/books-old-age.html). 

Regarding timing of this event, my Fidelity advisor said 'you can retire but there is the risk of you getting bored.' And a retiring Intel security Fellow opined 'you are too young to retire.' In retrospect I realize that I may be a bit junior to many of the 'retirement' cohort I see exiting since I dove head-first into tech w/o MS+PhD or military or ...et al hang-time. But given the exponential arcs of so much happening in tech and the richness of the world, I suspect I can find many a palliative to the specter of boredom (more 'dent' opportunities - see above).

Speaking of 'fellow,' that was definitely a milestone I had hoped to achieve in my quarter-century tenure at Intel. I try not to be sour grapes and think of the externally-hired-in fellows who only had to align with Professor Galloway's 'it's easy to fall in love with someone for an hour' when comparing external versus internal promotions. Instead I'd say Intel offered many open doors for me and perhaps I simply stumbled into the door jamb? It was never aspiring toward the fellow role just for the sake of the title. Instead, I view achieving a fellow promotion as both an acknowledgement of the observed fellow-level impact plus the ability to have more insight into and ability to help advise the business (i.e., a bigger platform to help make those 'dents in the universe').

Regarding that out-of-reach cohort, I did have a chance to leave a small mark for system software next to the Fellows and Senior Fellows, as chronicled in https://vzimmer.blogspot.com/2021/07/patents-and-co-inventors.html and https://vzimmer.blogspot.com/2022/09/new-milestones.html. Recall the century-milestones I related of:

From https://levels.fyi

If not fellow, I have at least tried to level up to my 'Senior' taxonomy this year, though, by applying for senior member status of the ACM https://awards.acm.org/senior-members/award-recipients?year=2024&award=159&region=&submit=Submit&isSpecialCategory= 

and the IEEE, respectively


I just made it into 'senior member' under the 30 year milestone of my time with IEEE, for example. So I'm exiting this tech company as a pure-play 'senior' (e.g., Intel Sr. PE, Sr. member ACM, Sr. member IEEE), it seems. What's next on the 'senior' theme?  More senior moments undoubtedly, sliding into senior citizen-hood, ....?

So now to prepare for the next months. One colleague who left from another tech company years ago into Intel told me it took him 2 years to get over leaving his last shop. And another colleague who left Intel for a FAANG company a couple of years ago told me that you fade away quickly from people's memories at Intel, easily within 2 years (2 mos., 2 days, 2 hrs?). So I guess the overlap is 'getting over' job.last and being forgotten by colleagues.last :)

Time.  Time.  As I sit on 12 weeks of accumulated sabbatical (closing in on 16) & a vacation free recent couple of years, I suppose the universe with this 'enhanced retirement package' has finally figured out a way to make me close my Intel laptop lid. And close it I shall. 

In closing, my personal tell is that once I’m done with the meat of a conversation I start philosophizing too much.  And on that note it’s time to end this conversation since my philosophizing has eaten the word budget on this post more than usual.

Thank you all and good-bye,

Vincent

PS if you ever need to contact me, my info is at the top of https://sites.google.com/site/vincentzimmer/cv


Sunday, June 30, 2024

500k

500k.  An interesting milestone. This figure comes from the Springer-Verlag site https://link.springer.com/book/10.1007/978-1-4842-0070-4. I was asked by a colleague how many of the free Kindle copies have been downloaded from Amazon https://www.amazon.com/Embedded-Firmware-Solutions-Development-Practices-ebook/dp/B01JC1LDTY and I didn't have any idea.  Probably a multiple of this number given the paucity of free books in this category?

Either way, the milestone generates a few thoughts. One is a reminder that writing technical books isn't about generating large incomes from their sales. A recent Hacker News thread https://news.ycombinator.com/item?id=40830332 and its associated article https://architectelevator.com/strategy/book-author-economics/ are a reminder of this.

"My motivation for writing the book was never the money, and I've generally treated the royalties as a nice bonus. I started writing because I cared a lot about the technology, and I wanted to share it with other people. Writing the book was my way of contributing something to a community that I'd benefited from a lot in my career." 

Another memory includes a dual perspective to the 'open platform blog' http://vzimmer.blogspot.com/2023/05/open-platforms-snapshot-may-2023.html, namely the binary dimension. If you recall from that posting I cite the open source presentation that included the line "Minimize IP components in binary like Intel FSP." So the FSP evolution was always the binary portion of a full solution based on open platform code. A rough roadmap of this work leading up to 2022 can be found in https://link.springer.com/chapter/10.1007/978-1-4842-7939-7_5


As a refresher, in 2014 we were faced with how to support platform code of both coreboot https://www.intel.com/content/www/us/en/developer/articles/tool/coreboot.html and EDKII-ilk https://www.intel.com/content/www/us/en/developer/articles/tool/unified-extensible-firmware-interface.html. The proposal of the multi-division working group I started then included the approach shown in https://www.intel.com/content/dam/develop/external/us/en/documents/sf14-stts001-820295.pdf.


The IOT division (now NEX) was already leaning in to using FSP but they mixed the SOC specific details and the API. One of the first things we did was to split out the interface from the SOC-specific implementation. This led to the series of FSP External Architecture Specifications (EAS) found at https://github.com/intel/fsp/wiki and the 'integration guides' found on https://github.com/intel/fsp, such as https://github.com/intel/FSP/blob/master/RaptorLakeFspBinPkg/Docs/AlderLake_FSP_Integration_Guide.pdf

As part of the journey of making community based development less difficult, I was able to clean up the license of the FSP from a 10-page click-through to a simple one based upon the microcode license https://www.phoronix.com/news/Intel-Better-FSP-License https://mail.coreboot.org/pipermail/coreboot/2018-August/087220.html

With FSP2.0 we introduced the FSP-T, FSP-M, and FSP-S to support the non-memory mapped boot map of Apollo Lake (a topology described in https://cdrdv2.intel.com/v1/dl/getContent/671281), and 2.1 introduced dispatch mode for easier integration in a native EDKII environment. The original way to interface with the Intel FSP used by coreboot and slim bootloader is called API mode.
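For readers who have not driven an FSP in API mode, the ordering can be sketched as a toy model. This is only an illustration in Python, not Intel code: the entry-point names are assumed from the FSP 2.x EAS, the `ApiModeBootloader` class and its ordering check are hypothetical, and real platforms may skip or vary steps.

```python
# Toy model of a bootloader calling FSP entry points in API mode.
# Entry-point names are assumed from the FSP 2.x EAS; the class is hypothetical.

FSP_ENTRY_POINTS = {
    "TempRamInit": "FSP-T",     # set up temporary RAM (e.g., cache-as-RAM)
    "FspMemoryInit": "FSP-M",   # initialize permanent memory
    "TempRamExit": "FSP-M",     # tear down temporary RAM
    "FspSiliconInit": "FSP-S",  # remaining silicon initialization
    "NotifyPhase": "FSP-S",     # bootloader signals boot milestones
}
API_MODE_ORDER = list(FSP_ENTRY_POINTS)


class ApiModeBootloader:
    """Records calls and rejects any entry point made out of order."""

    def __init__(self):
        self.trace = []

    def call(self, entry_point):
        # Once every step has run, only NotifyPhase may be repeated.
        step = min(len(self.trace), len(API_MODE_ORDER) - 1)
        expected = API_MODE_ORDER[step]
        if entry_point != expected:
            raise RuntimeError(f"expected {expected}, got {entry_point}")
        self.trace.append((FSP_ENTRY_POINTS[entry_point], entry_point))


if __name__ == "__main__":
    bootloader = ApiModeBootloader()
    for entry in API_MODE_ORDER:
        bootloader.call(entry)
    for component, entry in bootloader.trace:
        print(component, entry)
```

The point of the model is only that the platform code (coreboot, slim bootloader) owns the control flow and calls into the binary at fixed points, in contrast to dispatch mode where the FSP contents are dispatched like other EDKII modules.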

All along the way the FSP's themselves were based upon a mixture of closed source EDKII style silicon code and open source EDKII infrastructure, as exemplified by the https://github.com/tianocore/edk2/tree/master/IntelFsp2Pkg.

So you will see that the timeline above from the 2022 book stops with FSP2.3.  Since then we released the FSP 2.4 specification. 2.4 was a pretty radical change to FSP that added things like 64-bit support, SMM encapsulation, cooperative state storage, and additional multi-phase. These FSP changes were part of the broader Universal Scalable Firmware (USF) effort https://universalscalablefirmware.github.io/documentation/8_scalable_fsp.html#sfsp-interactions.

USF https://www.intel.com/content/www/us/en/developer/articles/technical/universal-scalable-firmware.html was for a while called 'SubZero' to compose as part of the larger oneAPI effort publicly discussed by Raja at https://www.anandtech.com/show/15990/hot-chips-2020-live-blog-intels-raja-koduri-keynote-200pm-pt 


(BTW - this hierarchy also explains the challenges in writing a firmware technical book)


The idea was to have a 'sub zero' or 'level -1' as distinct from the level 0 device driver work of oneAPI https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.7vlscg

The USF stack entailed breaking up the specific concerns of SOC, platform, and boot technology, as shown in figure 


https://universalscalablefirmware.github.io/documentation/1_terminology.html. And unlike the 2014 IDF presentation that just showed FSP supporting coreboot and EDKII, USF vied to support additional platform code technologies, such as https://github.com/slimbootloader/slimbootloader and even the pure-Rust based https://github.com/oreboot/oreboot, at least until the latter removed their FSP support https://github.com/oreboot/oreboot/tree/remove-vendorcode-fsp in order to keep the project based purely on open sources.

This narrative isn't just my perspective. J. Zhang from Meta had written the following 


 in https://link.springer.com/book/10.1007/978-1-4842-7939-7


It's interesting that parties outside of my company use the 'OSF' (i.e., Open Source Firmware) acronym a lot. I'm sometimes surprised by this, since I rarely if ever hear the term within the corporate walls.

To me the important part of doing USF was the openness, including POC's and specification drafts at https://github.com/universalscalablefirmware. For example, we fabricated the FSP 2.4 changes for 64-bit at https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64, YAML-based configuration (versus bespoke BSF) https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_yaml, SMM encapsulation in FSP https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_x64_smm (originally inspired by https://www.intel.com/content/www/us/en/content-details/671459/a-tour-beyond-launching-standalone-smm-drivers-in-the-pei-phase-using-the-edk-ii.html), and a 'bootable FSP' or FSP@Reset or 'FSP-R' https://github.com/UniversalScalableFirmware/fspsdk/tree/qemu_fsp_at_reset.

During the early days of FSP we didn't just document the platform code usage to 'consume' the FSP but we also explicated how to 'produce' an FSP (i.e., the type of recipe used in the FSP QEMU instances above) https://www.intel.com/content/www/us/en/content-details/671448/tour-beyond-bios-creating-the-intel-firmware-support-package-version-1-1-with-the-edk-ii.html

And now in 2024 the bottom 3 of the collaborators are at Microsoft. Quite the change over time.

Additional information on USF can be found at https://github.com/UniversalScalableFirmware/Introduction/blob/main/USF_Overview.pdf, https://www.osfc.io/2021/talks/an-evolutionary-approach-to-system-firmware/, and https://uefi.org/sites/default/files/resources/USF_Security_Webinar_Final.pdf.

We even described how to share platform code between C, the predominant language of EDKII, coreboot, and slim bootloader, and Rust https://universalscalablefirmware.github.io/documentation/3_platform_orchestration_layer.html#shareable-platform-code-rust-binding-api

Speaking of Rust, the recently published https://techcommunity.microsoft.com/t5/surface-it-pro-blog/surface-uefi-evolution-in-boot-security-amp-device-management-to/ba-p/4159998 on MS  Rust support generated a few questions to me recently. I’m a fan of moving firmware into Rust in addition to other defense in depth (isolation, ISA mitigations, etc). We did an initial integration of Rust into EDKII https://github.com/tianocore/edk2-staging/tree/edkii-rust 5 years ago described in https://uefi.org/sites/default/files/resources/Enabling%20RUST%20for%20UEFI%20Firmware_8.19.2020.pdf  and https://cfp.osfc.io/media/osfc2020/submissions/SLFJTN/resources/OSFC2020_Rust_EFI_Yao_Zimmer_NDK4Dme.pdf.  We also provided guidance on Rust for firmware in one of our book chapters https://link.springer.com/chapter/10.1007/978-1-4842-6106-4_20

There is also the camp that advocates 'modern C++' as another memory-safe option alongside Rust https://www.cisa.gov/news-events/news/urgent-need-memory-safety-software-products for systems programming. I'm open to applying smart pointers and similar idioms to firmware, but as with Rust, the 'unsafe UEFI protocols' with their raw pointers will scope the safety to only the interior of PEIMs, DXE drivers, UEFI drivers, and UEFI applications, respectively.

The tianocore community ended up not pushing the Rust work into EDKII upstream for various reasons (people/value/feedback), including no one wanting to invest in the EDKII build system and drive an integration like this. Later work with Google Summer of Code yielded getting the UEFI Rust Crate upstreamed https://crates.io/crates/uefi. This allows for building stand-alone .efi images with this crate and including the resultant binary into EDKII full firmware integration.  This latter approach allows the community to leverage the goodness of the Rust ecosystem that is vibrant/supported/growing (Cargo, libraries of crates, auto test and doc generation, etc.) and avoid some of the vagaries of the EDKII native build system.

In addition to the API changes, the provenance of firmware was a design point. As such, we created the https://www.intel.com/content/www/us/en/content-details/644001/content-details.html specification to describe how to create manifests and measurements for the FSP and do the corresponding work for the Universal Payload (UPL) https://universalscalablefirmware.github.io/documentation/5_security.html#universal-payload-measurement. UPL is another aspect of the USF work that provides interoperability between how to boot, whether a UEFI style boot with the EDKII payload package, LinuxBoot, or an embedded hypervisor or RTOS. This type of layering for a very diffuse supply chain is akin to attempts like https://android-developers.googleblog.com/2017/05/here-comes-treble-modular-base-for.html. Just as the Android userland should be platform independent, there is a similar demarcation in UEFI where the bulk of the DXE drivers for UEFI compatibility is platform independent https://github.com/tianocore/edk2/tree/master/UefiPayloadPkg, with the same argument holding for a more generic Linux kernel for LinuxBoot https://www.linuxboot.org/.
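To make 'measurements' a bit more concrete, here is a minimal sketch of the general hash-extend idea behind a measured chain. It assumes TPM-style extend semantics with SHA-256; the component names and byte strings are made up, and it is not the format defined in the specification linked above.

```python
import hashlib

PCR_SIZE = 32  # SHA-256 digest length in bytes


def extend(pcr: bytes, component: bytes) -> bytes:
    """TPM-style extend: new = H(old || H(component))."""
    digest = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + digest).digest()


def measure_chain(components):
    """Measure (name, blob) pairs in boot order; return final value and event log."""
    pcr = bytes(PCR_SIZE)  # register starts as all zeros
    event_log = []
    for name, blob in components:
        pcr = extend(pcr, blob)
        event_log.append((name, hashlib.sha256(blob).hexdigest()))
    return pcr, event_log


def replay(event_log):
    """A verifier recomputes the register from the logged digests alone."""
    pcr = bytes(PCR_SIZE)
    for _, hexdigest in event_log:
        pcr = hashlib.sha256(pcr + bytes.fromhex(hexdigest)).digest()
    return pcr


if __name__ == "__main__":
    # Hypothetical boot components; real blobs would be firmware binaries.
    chain = [("FSP-M", b"fsp-m image"), ("FSP-S", b"fsp-s image"), ("UPL", b"payload image")]
    final, log = measure_chain(chain)
    print(final.hex())
    print(replay(log) == final)
```

Because each value folds in the previous one, the final register depends on both the content and the order of what was measured, which is what lets a manifest of expected digests be checked against what actually booted.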

Speaking of FSP 2.4, in postings the 64-bit work gets a call-out from Google in https://www.phoronix.com/news/Chrome-64-bit-Firmware-Adapt https://blog.osfw.foundation/chrome-ap-firmware-adopting-to-x86_64-architecture/. It still feels like yesterday when I coded up the first PEI code https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Core/DxeIplPeim to transition to a 64-bit DXE from a 32-bit PEIM 20 years ago. Given our small amount of cache-as-RAM at the time it seemed otherworldly to imagine moving both PEI and DXE to 64-bit, so we opted for the 32-bit PEIM and 64-bit DXE we have had up to today. I also recall looking at the sample code of the AMD64 data book at the time to inspire some of this machine transition code creation.

Although most of the posted FSP's at https://github.com/intel/fsp are client and microserver, big core Xeon is joining the list.

Specifically the use of FSP for Xeon gets mention in https://www.phoronix.com/news/Bytedance-CloudFW-Open-Source https://bytedance.larkoffice.com/file/boxcnIHvljaKfN2EaEr0H2ZMzyg and has made progress with https://github.com/intel/FSP/tree/master/EagleStreamFspBinPkg and associated open source platform code at https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp, including the Eagle Stream mentioned above https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/spr and the upcoming GNR https://github.com/coreboot/coreboot/tree/main/src/soc/intel/xeon_sp/gnr. The spr coreboot workflow has a nice overview at https://www.intel.com/content/www/us/en/content-details/778593/coreboot-practice-on-eagle-stream.html, too.

AMD has been working on open sourcing coreboot code for their Epyc servers, with https://github.com/coreboot/coreboot/tree/main/src/soc/amd/genoa_poc that leverages the https://github.com/openSIL/openSIL libraries (sort of like an open source variant of Intel's FSP-S code) and binaries posted https://github.com/amd/firmware_binaries/tree/main/genoa/PSP

Open source platform code is interesting. It may offer sustainability options, such as creating your own firmware for a decommissioned server board, or one for which ownership has been transferred. The concept of ownership transfer can be found in work at the OCP https://www.opencompute.org/documents/ibm-white-paper-ownership-and-control-of-firmware-in-open-compute-project-devices. This type of sentiment is part of circular economy thinking.

Speaking of servers, I joined Intel to lead the 64-bit Merced firmware. We launched EFI on that platform but built upon SAL and PC/AT BIOS. Afterward when Tiano and the Framework-based EDK code was developed for a full platform initialization, I was asked to lead getting the first IA32 Xeon product to adopt the technology. It was the Blackford https://ark.intel.com/content/www/us/en/ark/products/27746/intel-5000p-memory-controller.html chipset-based platform. There was immense push-back from the internal teams to EDK and EFI in general. Originally we thought servers would embrace EFI for use-cases like provisioning, etc., but it turned out servers were the most conservative product category and often last to change.

If you made it this far I apologize. This is the type of blog you get when I camel up a lot of thoughts and don't commit to a final draft, I suppose, for some months. And to continue the meandering, one other sentiment that the above history of crafting firmware specifications reminds me of is how informal, semi-formal, and formal techniques can be applied to this domain going forward. I was reminded of this imperative by the quotation:

"If you’re a software engineer, especially one working on large-scale systems, distributed systems, or critical low-level system, and are not using formal methods as part of your approach, you’re probably wasting time and money. Because, ultimately, engineering is an exercise in optimizing for time and money." https://brooker.co.za/blog/2024/04/17/formal.html

I often tell folks that engineers are like applied economists.  Sufficient outcome for the lowest cost. This is another trope, along with my 'business/team/career' hierarchy of importance, that I often quote.

And speaking of another Seattle data point beyond Amazon's Brooker quotation above, I am sad to see that the computer history museum I mentioned 6 years ago http://vzimmer.blogspot.com/2018/06/ is going away https://www.geekwire.com/2024/seattles-living-computers-museum-logs-off-for-good-as-paul-allen-estate-will-auction-vintage-items/. UW hosted an event at the museum after Allen's donation ended up renaming the school in his name. Sadly he passed away a few months later. With the following COVID and settling of his estate, it appears that the museum is a victim of the times.

On a brighter note, I was happy to see another local, Microsoft's Dave Thaler https://www.microsoft.com/en-us/research/people/dthaler/

appear in the eBPF https://ebpf.io/ documentary https://www.youtube.com/watch?v=Wb_vD3XZYOA. I worked with Dave in the late 2000's on evolving UEFI network boot to IPV6 https://www.rfc-editor.org/rfc/rfc5970.txt https://www.ietf.org/archive/id/draft-zimmer-dhc-dhcpv6-remote-boot-options-01.txt.  He looks largely the same as when we were drafting the RFC in his MSFT office or co-presenting at some IETF session. I wish I could say the same about myself. And of course the other notable figure from that documentary who now works at Intel and with whom I had the chance to collaborate  https://patents.google.com/patent/US20240143341A1/en

is the compute performance guru Brendan Gregg https://www.brendangregg.com/. Given his office in Australia I am dubious about f2f co-work opportunities, though, as I had with Dave.

Well, enough for June. Here's looking forward to some thoughts in the upcoming months.

PS
I still need to reconcile my usage of other sites versus blogger. I snapped a couple of conversations since I think the free/community version of Slack removes content after some time window (90 days?).

Specifically, here are some responses I posted on the OSFC slack channel in response to queries, viz.,

https://app.slack.com/client/T0RASQBGW/C9ZLS0U4F


I can understand your confusion.  The answer is mostly #3.

Per your question - the typical model is for a hardware root of trust (Intel BtG, AMD PSP, etc) to verify the firmware volume w/ SEC + PEI code, or "Initial Boot Block" (IBB), via a hash comparison.  Then the IBB code has a library to do verification of the OBB via another hash comparison via code like https://github.com/tianocore/edk2/blob/master/SecurityPkg/FvReportPei/FvReportPei.c.  The OBB is another firmware volume.  The OBB contains DXE and the UEFI Secure boot logic.  The code in the OBB then validates 3rd party UEFI drivers in option ROMs and UEFI images on disk or network via asymmetric crypto verification of the Authenticode-based signed PE's.  You can see all of this put together in https://tianocore-docs.github.io/Understanding_UEFI_Secure_Boot_Chain/draft/secure_boot_chain_in_uefi/boot_chain__putting_it_all_together.html

PS
The 'Secure boot' naming in the UEFI Spec was really a mistake made by some folks marketing Windows.  The 'secure boot' section was about a network auth protocol, and the PE/COFF signing really didn't get read in until https://uefi.org/specs/UEFI/2.10/32_Secure_Boot_and_Driver_Signing.html#uefi-driver-signing-overview.  In general it was a booboo to even call 32.1 'secure', but that's a sin of decades past now.

Also, I originally hoped to do per PEIM and per DXE validation as noted 20 years ago in https://www.researchgate.net/publication/377810413_TechnologyIntel_Magazine_-_Advances_[…]atform_Firmware_Beyond_BIOS_and_Across_all_Intel_R_Silicon with the sentence "The Framework and EFI drivers may optionally be cryptographically validated before use to ensure that a chain of trust exists from power-on until the OS boots and beyond."  Framework https://www.intel.fr/content/dam/doc/product-specification/efi-driver-execution-interface-dxe-cis-specification.pdf was the name of the PI specs before they were donated/std'ized in the UEFI Forum as the Platform Initialization (PI) specs.  The thinking was that PEIM and DXE binaries could be sourced from different vendors, whereas today most people build their PEI and DXE from source.  It's the UEFI drivers and Apps that are ingested as 3rd party binaries given the difference between OEM's (PI code), IHV's (adapter card UEFI drivers), and OSV's (OS loaders) in the supply chain.
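
As a rough illustration of the first two hash-comparison links described above, here is a minimal Python sketch. The function names, the toy image bytes, and where the expected digests are stored are all assumptions for illustration; this is not the Boot Guard, PSP, or FvReportPei implementation. The final Authenticode signature check done by the OBB is left out since it needs asymmetric crypto beyond the Python standard library.

```python
import hashlib
import hmac


def sha256(image: bytes) -> bytes:
    """Digest of a firmware volume (toy stand-in for the real hash step)."""
    return hashlib.sha256(image).digest()


def hash_matches(image: bytes, expected: bytes) -> bool:
    """Constant-time comparison of the computed digest with the expected one."""
    return hmac.compare_digest(sha256(image), expected)


def boot(ibb: bytes, obb: bytes, ibb_digest: bytes, obb_digest: bytes) -> list[str]:
    """Walk the chain root of trust -> IBB -> OBB, stopping at the first failure.

    ibb_digest: hypothetically provisioned with the hardware root of trust.
    obb_digest: hypothetically carried inside the already-verified IBB.
    """
    log = []
    if not hash_matches(ibb, ibb_digest):
        log.append("IBB rejected")
        return log
    log.append("IBB verified by root of trust")
    if not hash_matches(obb, obb_digest):
        log.append("OBB rejected")
        return log
    log.append("OBB verified by IBB")
    return log


# Toy images; real ones are firmware volumes.
IBB = b"SEC + PEI code"
OBB = b"DXE + UEFI Secure Boot logic"

good = boot(IBB, OBB, sha256(IBB), sha256(OBB))
tampered = boot(IBB, OBB + b"!", sha256(IBB), sha256(OBB))
```

The point of the sketch is only the ordering: each stage is checked by the stage before it, so a modified OBB is caught before any of its code would run.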


https://app.slack.com/client/T0RASQBGW/C0RAR7JRM



The UEFI PI spec defines a dependency expression (depex) https://uefi.org/specs/PI/1.8A/V2_DXE_Dispatcher.html#dependency-expressions section in the firmware file of a PEIM or DXE driver that has an RPN encoding of the PPIs or protocols consumed by a module.  The PEI and DXE cores use the depex to see if the required PPI's or Protocols have been published prior to dispatching a PEIM or DXE driver.

That's the standards side.  On the code side, the EDKII implementation .inf consumes and produces are not used to generate the dependency expression. The .inf file for a given module has the expression under the '[Depex]' portion of the file https://tianocore-docs.github.io/edk2-InfSpecification/draft/2_inf_overview/215_[depex]_section.html#215-depex-section.  These are manually created since the developer can conditionally depend upon other ppis/protocols (imagine control flow based upon some platform state such as a GPIO asserted that tells code whether or not to invoke some 'recovery' PPI/protocol).  That's why you see things like "SOMETIMES_CONSUMES" in files like https://github.com/tianocore/edk2/blob/master/SecurityPkg/Tcg/Tcg2Dxe/Tcg2Dxe.inf
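
To make the RPN idea concrete, here is a small Python sketch of a stack-based depex evaluator plus a toy dispatcher loop. It is not the EDKII core code: opcodes are modeled as strings rather than the binary encoding in the PI spec, the BEFORE/AFTER/SOR opcodes are omitted, and the module and protocol names are hypothetical.

```python
def evaluate_depex(expression, published):
    """Evaluate an RPN dependency expression against the published set.

    expression: sequence of tuples such as ("PUSH", "NAME"), ("AND",), ("END",).
    published:  set of PPI/protocol names installed so far (hypothetical names).
    """
    stack = []
    for opcode, *operand in expression:
        if opcode == "PUSH":
            stack.append(operand[0] in published)
        elif opcode == "TRUE":
            stack.append(True)
        elif opcode == "FALSE":
            stack.append(False)
        elif opcode == "NOT":
            stack.append(not stack.pop())
        elif opcode in ("AND", "OR"):
            right, left = stack.pop(), stack.pop()
            stack.append((left and right) if opcode == "AND" else (left or right))
        elif opcode == "END":
            return stack.pop()
        else:
            raise ValueError(f"unsupported opcode: {opcode}")
    raise ValueError("depex has no END opcode")


def dispatch(modules):
    """Keep sweeping the module list until no further module becomes runnable."""
    published, order, pending = set(), [], list(modules)
    progress = True
    while pending and progress:
        progress = False
        for module in list(pending):
            if evaluate_depex(module["depex"], published):
                order.append(module["name"])
                published.update(module["produces"])
                pending.remove(module)
                progress = True
    return order, [m["name"] for m in pending]


modules = [
    {"name": "B", "produces": set(),
     "depex": [("PUSH", "VARIABLE"), ("PUSH", "TCG2"), ("AND",), ("END",)]},
    {"name": "A", "produces": {"VARIABLE"}, "depex": [("TRUE",), ("END",)]},
    {"name": "C", "produces": {"TCG2"}, "depex": [("PUSH", "VARIABLE"), ("END",)]},
    {"name": "D", "produces": set(), "depex": [("PUSH", "RECOVERY"), ("END",)]},
]
order, never_dispatched = dispatch(modules)
```

Module D is never dispatched because nothing publishes its dependency, which mirrors how a driver with an unsatisfied depex simply stays undispatched.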

Saturday, March 30, 2024

A legend passes

Sad to see the news about Ross Anderson https://en.wikipedia.org/wiki/Ross_J._Anderson passing

https://alecmuffett.com/article/109513

https://news.ycombinator.com/item?id=39864210

https://twitter.com/duncan_2qq/status/1773752269395099774 


Like many I was inspired and informed by his various editions of the "Security Engineering" book https://www.cl.cam.ac.uk/~rja14/book.html. I also explored the domain via papers like https://www.cl.cam.ac.uk/~fms27/papers/1999-StajanoAnd-duckling.pdf that I referenced in https://www.researchgate.net/publication/221199899_Platform_Trust_Beyond_BIOS_Using_the_Unified_Extensible_Firmware_Interface/references. I also cull wisdom from papers like https://www.cl.cam.ac.uk/~rja14/Papers/satan.pdf since having worked on the boundary of software and hardware for so long, sometimes errant hardware or firmware is truly an embodiment of 'Satan's Computer.'

My small interaction with Prof Anderson was during the writing of https://link.springer.com/book/10.1007/978-1-4842-6106-4

 


My co-author and I reached out to see if Anderson would write a foreword, with the below response


Luckily we did get a very insightful write-up from Leendert Van Doorn https://blog.paramecium.org/about/.

This was an ironic pairing in retrospect seeing Anderson's critiques of Trusted Computing and Leendert's contemporary contributions to that domain, respectively. Having these titans both critique https://www.cl.cam.ac.uk/~rja14/tcpa-faq-1.0.html and build a domain https://www.amazon.com/Practical-Guide-Trusted-Computing/dp/0132398427, like TCPA (now TCG https://en.wikipedia.org/wiki/Trusted_Computing_Group) Trusted Platform Modules, represents a healthy aspect of technology evolution in my view. Differing views make any technology stronger, versus groupthink & homogeneity of thought.

 Sad times for the security community, though, with the loss of a legend.





Sunday, March 24, 2024

Sneers, CNAs, licenses, and fuzzing

Let's start off with something I occasionally see in industry, namely 'the grand sneer' mentioned in https://buttondown.email/hillelwayne/archive/know-of-the-right-tool-for-the-job/. I sometimes see that the 'sneering' is often a sign of youth or narrow experience or not exploring outside of your domain or https://twitter.com/vincentzimmer/status/1762972464169296002... 

The more you know often leads to greater humility born of realizing how much knowledge there is in the world that you don't know.

Another interesting posting of late was the fact that the Linux kernel is now a CNA https://amanitasecurity.com/posts/dear-linux-kernel-cna-what-have-you-done/ https://news.ycombinator.com/item?id=39627302. I noted that there are similar challenges in other open source infrastructure like https://github.com/tianocore/tianocore.github.io/wiki/Reporting-Security-Issues in https://twitter.com/vincentzimmer/status/1768351312205484380

Another posting in that thread clicked into the SBOM topic with an advocacy for the VEX format. Some work in this space can be found in https://github.com/hughsie/uefi-sbom-best-practices/blob/main/index.rst, too.

So a lot of these thoughts are born of experience. Amazon has a famous quote that goes something like "there is no compression algorithm for experience," but I'd have to say things are getting pretty good with LLM's. In fact I am glad that my longer form works were published prior to ChatGPT.  Maybe the world of text will be bifurcated into BG and PG - "Before GPT" and "Post GPT."

I don't subscribe to the dystopian 'paperclip' https://cepr.org/voxeu/columns/ai-and-paperclip-problem style apocalypse of AI but I do admire the foundations upon which these large foundation models are built, namely the sum of human knowledge, or the internet. From the hockey-stick style growth of the net in '97 from the Metacrawler era http://vzimmer.blogspot.com/2021/01/memories-from-uw-and-cornell.html to today's corpus of information on the web, it's truly staggering.

Some examples of oopsies around folks leveraging ChatGPT a little too much include https://www.sciencedirect.com/science/article/abs/pii/S2468023024002402 https://simonwillison.net/2024/Mar/15/certainly-here-is-google-scholar/ and https://news.ycombinator.com/item?id=39733605.

Speaking of experience, Subrata made a nice posting https://twitter.com/abarjodi/status/1771948383529247011, namely the "FSP Customization - Remove non-mandatory components in the Intel FSP" talk for the Open Source Firmware Foundation (OSFF) Byte talks - volume 1, March 8, 2024 https://opensourcefirmware.foundation/events/bytetalks-vol.-1/. The video is now posted at https://www.youtube.com/watch?v=0ciYjPSu56A. This builds on work trying to help the various communities https://www.phoronix.com/news/Google-Intel-More-FSP-Flexible https://blog.osfw.foundation/breaking-the-boundary-a-way-to-create-your-own-fsp-binary/. In the past, we responded to the concerns about FSP licensing described in https://www.phoronix.com/news/Intel-Better-FSP-License https://mail.coreboot.org/pipermail/coreboot/2018-August/087220.html


It's hard to 'sneer' when the community is seeing problem statements not necessarily experienced in your own environment or workflow. 

Sometimes folks don't sneer but ignore. For example, the use of SIMICS https://github.com/intel/tsffs for fuzzing firmware mentioned in https://twitter.com/jerry_Intel/status/1762220373503005056 regrettably didn't cite https://ieeexplore.ieee.org/document/9218694 in their blog https://community.intel.com/t5/Blogs/Products-and-Solutions/Security/Chips-Salsa-This-Hardware-Does-Not-Exist/post/1572067. I ordinarily wouldn't call folks out if it weren't for the fact that in an internal presentation of their work I mentioned the preceding development on UEFI SIMICS fuzzing and the ensuing paper to the TSFFS folks, with a response from the TSFFS lead that "Oh yes, we leveraged that work.  We were disappointed that you published first so that we couldn't." So at least not a sneer :)  

On a more positive note, the team did some great evolution, including extending to 'beyond BIOS' use-cases, getting it open sourced, and finally, against many odds within large companies enamored of Python et al these days, evolving the feature to use the Rust language. 

And additional props go out to my former software division that delivered TSFFS to open source for their work in evolving HBFA https://github.com/tianocore/tianocore.github.io/wiki/Host-Based-Firmware-Analyzer with their https://github.com/intel/HBFA-FL project. They did a nice job on ack'ing the earlier work, too https://www.intel.com/content/dam/develop/external/us/en/documents/intel-usinghbfatoimproveplatformresiliency-820238.pdf



Although a lot of the constituent elements like https://github.com/S2E are in the open, I wasn't able to get the symbolic execution work described in https://www.usenix.org/conference/woot15/workshop-program/presentation/bazhaniuk across the open source finish line. The lure of retirement, Amazon, and Eclypsium ended up disbanding that team over time and no new team emerged from the ashes to carry it forward. 



Saturday, February 24, 2024

27 or Anniversary.Next^12, AI, Runtime

Anniversary

True to form, today is my work anniversary. I started at Intel on February 24, 1997. This post also builds on my last posting in this vein http://vzimmer.blogspot.com/2023/02/26-or-anniversarynext11-and-wisdom-of.html. At this point I have spent more than half of my life on this planet at this single company.  

Since the last posting I've been back to the office daily, sometimes visiting the Crossroads for lunch, where I had my first sighting of a Cyber Truck.



I'd often work through lunch and eat from the local self-serve facilities but the pickings have been slim. So slim that even a 2-month-expired tuna sandwich was eaten by someone.




 I last took a sabbatical in 2011 where I smashed up my arm on the 2nd day and ended up w/ 2 surgeries. I still recall the one-handed typing away at the ITJ articles https://www.intel.com/content/dam/www/public/us/en/documents/research/2011-vol15-iss-1-intel-technology-journal.pdf that appeared later that year.  Typically sabbaticals expire but with the strangeness of COVID there were extensions that turned into a revised evergreen policy where sabbaticals no longer expire.  Regrettably, though, one stops accruing time after reaching 16 years. Looks like I need to pull the trigger on at least one month in the upcoming year, viz.,

 

Beyond cyber trucks, old sandwiches, and sabbaticals, the last year has seen a lot of energy around AI. Even the University of Washington lecture series has been dominated by this topic. An interesting talk from Meta was included in https://www.ece.uw.edu/news-events/lytle-lecture-series/ with slides https://www.ece.uw.edu/wp-content/uploads/2024/01/lecun-20240124-uw-lyttle.pdf and a recording https://www.youtube.com/watch?v=d_bdU3LsLzE. You can catch me at 6:36 on LHS of the screen


It's always interesting to be there in person. One comment that intrigued me came from Yann LeCun about the Metaverse build out and GPU procurement. I cannot recall if it was on-tape or off-the-record so I'll avoid going further than that.



This talk was at the Lyceum HUB but all of the other talks were across the street.



Another interesting talk https://www.cs.washington.edu/events/colloquia/details?id=3310 was from Fei-Fei Li of Stanford and ImageNet https://ieeexplore.ieee.org/document/5206848 fame.
This was near the comp sci building

LeCun's and Li's talks were like Taylor-Swift-concert-crowds but for techies. Nice to see these thought leaders share their insights and experience in person. 

Google Waymo was also in the mix with a talk https://tcat.cs.washington.edu/events/taskar-center-memorial-distinguished-lecture/ on the challenges of autonomous driving and AI https://www.youtube.com/watch?v=pK5ChzMsfE0


So let's pivot from AI lectures to a topic closer to home, namely firmware runtime. I chose this topic since one of the most popular posts in this blog series was on UEFI variable runtime access http://vzimmer.blogspot.com/2012/12/accessing-uefi-form-operating-system.html. I suspect this stems from the fact that most folks' interactions over the life of a platform happen at runtime. In our quest for 0-second boot this makes even more sense. 

Regarding how firmware is exposed to the operating system, and thus ultimately the user, at runtime, there are various modalities. 

These include:

  • Static info tables
    • Advanced Configuration and Power Interface (ACPI)
    • Flattened Device Tree (FDT)
  • Interpreter bytecode
    • ACPI AML interpreter
    • UEFI EBC sandbox
    • x86 VDM for video int10h calls
  • Native code runtime
    • UEFI runtime
    • Power9 OPAL
    • Platform Runtime Mechanism (PRM)
  • Opaque host modes - synchronous and asynchronous activations
    • SMI (SMM) - x64
    • SMC (TrustZone) - ARM
    • Trap (Machine Mode) - RISC-V

These are all for the host firmware, or firmware running on the main application processor/core.

There is also device firmware running in the SOC.

In addition, at the platform level, the host can signal 'non-host' such as an Embedded Controller (EC) on client and Baseboard Management Controller (BMC) on server.

Some of the latter are shown in a figure from https://link.springer.com/book/10.1007/978-1-4842-7939-7.


A lot of the flows are blended. For example, the use of ACPI and SMI for error handling is described in https://cdrdv2.intel.com/v1/dl/getContent/671067, but the paper is light on ACPI.

Regrettably the collection at https://www.amazon.com/stores/Vincent-Zimmer/author/B002I6IW4A is pretty spare on ACPI, too, although https://link.springer.com/book/10.1007/978-1-4842-7974-8 covers construction and design in the case studies.

A curation of ACPI material can be found in the following:


Public documents on ACPI

2022

Intro to ACPI from ACPI spec

https://uefi.org/specs/ACPI/6.5/01_Introduction.html


2019 

ACPI tutorial

https://acpica.org/sites/acpica/files/asl_tutorial_v20190625.pdf 


2013

ACPI and UEFI

https://cdrdv2.intel.com/v1/dl/getContent/671067 

APEI and UEFI


ACPI overview https://www.intel.com/content/dam/www/public/us/en/documents/research/2009-vol13-iss-1-intel-technology-journal.pdf


2009

ACPI and SMM

https://link.springer.com/article/10.1007/s11416-009-0138-0 


2006

ACPI attacks

https://www.blackhat.com/presentations/bh-europe-06/bh-eu-06-Heasman.pdf 


2004

ACPI HOWTO

https://tldp.org/HOWTO/pdf/ACPI-HOWTO.pdf


2003

IA64 book, including UEFI and ACPI chapter

https://www.amazon.com/Itanium-Architecture-Programmers-Understanding-Processors/dp/0131013726

 

2001

ACPI Book

https://dl.acm.org/doi/book/10.5555/940719 


1997 

ACPI implementation guide

https://www.baldwin.cx/~phoenix/reference/docs/acpi_impguide.pdf 


1996 for first spec - full history in 
https://en.wikipedia.org/wiki/ACPI 

 
A colleague suggested that I should have spent more time in the books writing about ACPI since he spends his time debugging issues on Windows and can quote many instances of poor or misunderstood ACPI constructions. I also chronicled in the past my exchange https://vzimmer.blogspot.com/2018/ with the Linux kernel leadership on the topic. 

A lot of people grouse about UEFI and ACPI when in fact it isn't the standard they are critiquing but instead the implementation.  Namely, it's often a confusion between interface and implementation. 

For example, on topics like the firmware support package (FSP), I hear complaints. I often offer the following dichotomy there:

    Consumer (caller)

    Interface (specification)

    Producer (implementation)


You can find examples of using FSP (consumer/caller) with EDKII 

Sometimes people grouse about FSP when in fact it's the implementation, not the API. And it has been a long run on FSP. The Intel IOTG folks kicked it off in 2010 and then Jiming and I conspired to have a working group that I've led/co-led since 2014 on the same with various other co-leads and collaborators along the way. Some of the outputs of that collaboration that record details from Maurice, Ravi and Jiming can be found in the '15 book https://link.springer.com/book/10.1007/978-1-4842-0070-4, too. Maurice is now doing great things in fw at MS and Jiming at AMZN, resp., AFAIK.

One of my key contributions in those 2013/2014 days was teasing out the FSP spec, which was originally an amalgam of silicon details and API's, into a couplet of docs, namely the main interface spec, the FSP EAS, and the respective SOC integration guides, such as can be found today in https://www.intel.com/fsp and in https://www.github.com/intel/FSP, respectively. This allowed for creating class drivers for FSP in the various platform code (e.g., coreboot and EDKII at the time) and abstracting SOC specifics. And I shouldn't forget how the decade+ of cross-group collaboration allowed for scaling FSP's from embedded to both mainstream client and servers, as demonstrated by the rich postings on Github. It's not a perfect split between EAS and integration guide, though. As we've struggled with the 2001 Intel Framework API's and 2006+ UEFI PI spec interfaces, building that 'firmware socket' set of abstractions is tough given the variability of silicon and products over time.

Same with UEFI.  The interface is at https://uefi.org/specifications. Two implementations can be found at https://github.com/tianocore/edk2/tree/master/MdeModulePkg/Core/Dxe and https://github.com/u-boot/u-boot/tree/master/lib/efi, respectively. Most folks complain about the EDKII implementation when they invoke 'UEFI is broken', I feel. This reminds me of the sentiment that some person with the handle of 'the_panopticon' mentioned in https://news.ycombinator.com/item?id=39481434
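
The consumer / interface / producer split can be sketched in a few lines of Python. Everything here is hypothetical and invented for illustration (the class names, the tiny install/locate interface, the GUID string); it is not the FSP or UEFI API. It only shows one interface with two independent producers and a consumer that cannot tell them apart.

```python
from abc import ABC, abstractmethod


class ProtocolDatabase(ABC):
    """Interface (the 'specification'): the only thing a consumer may rely on."""

    @abstractmethod
    def install(self, guid: str, interface: object) -> None: ...

    @abstractmethod
    def locate(self, guid: str) -> object: ...


class DictProducer(ProtocolDatabase):
    """Producer #1: one possible implementation, backed by a dict."""

    def __init__(self) -> None:
        self._db = {}

    def install(self, guid, interface):
        self._db[guid] = interface

    def locate(self, guid):
        if guid not in self._db:
            raise LookupError(guid)
        return self._db[guid]


class ListProducer(ProtocolDatabase):
    """Producer #2: same interface, different internals (a linear list)."""

    def __init__(self) -> None:
        self._entries = []

    def install(self, guid, interface):
        self._entries.append((guid, interface))

    def locate(self, guid):
        # Search newest-first so a re-install shadows an older entry.
        for known_guid, interface in reversed(self._entries):
            if known_guid == guid:
                return interface
        raise LookupError(guid)


def consumer(db: ProtocolDatabase) -> object:
    """Consumer (caller): written purely against the interface."""
    db.install("hypothetical-console-guid", "console")
    return db.locate("hypothetical-console-guid")


results = [consumer(DictProducer()), consumer(ListProducer())]
```

A bug in `ListProducer` would be an implementation defect; the interface and the other producer would be unaffected, which is the distinction the complaints above tend to blur.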

So back to ACPI. The evolution of ACPI in the mid-90's predates late 90's EFI. EFI started as a sample and then the Tiano project with EDK and then EDKII subsumed Framework/PI and EFI/UEFI but not the ACPI specification. As such, there was no modularization of ACPI from the beginning because of Conway's Law (i.e., the folks who owned the UEFI spec and its reference implementation didn't own the ACPI spec for the first nearly 2 decades). In fact a lot of the ACPI table construction on the Intel platforms was derived from the DaVinci/Kittyhawk clean-room C BIOS that pre-dated Tiano in that hotbed of late 90's BIOS innovation called the DuPont, WA Intel site. This model of static tables differs from the ARM ecosystem, which does dynamic table generation since they started much later on the journey and had a single-team/Linaro view of entering standards-based firmware.

An advantage of dynamic table generation is flexibility but a downside is that it is harder to do attestation since the table fields are not known at build time. Even build time calculations, though, are aggravated by patching. So in practice it is not really done, just like a lot of configuration and PCR[1] content not really being reconstructed for some attestation / verifier flows.
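
To illustrate why build-time knowledge matters for a verifier, here is a minimal Python sketch of a TPM-style extend operation (new PCR = hash of old PCR concatenated with the measurement digest) and an event-log replay. The table bytes are made up, and which PCR a given platform uses for ACPI tables is deliberately left out; treat this as a sketch of the arithmetic only.

```python
import hashlib

PCR_SIZE = 32  # bytes in a SHA-256 PCR bank


def extend(pcr: bytes, data: bytes) -> bytes:
    """TPM-style extend: new PCR = H(old PCR || H(data))."""
    measurement = hashlib.sha256(data).digest()
    return hashlib.sha256(pcr + measurement).digest()


def replay(events: list[bytes]) -> bytes:
    """Recompute the final PCR value from an ordered list of measured blobs."""
    pcr = bytes(PCR_SIZE)  # the PCR is assumed to start at all zeros
    for blob in events:
        pcr = extend(pcr, blob)
    return pcr


# Hypothetical table contents, invented for illustration.
STATIC_TABLE = b"TBL1:cpus=4"   # fully known at build time
DYNAMIC_TABLE = b"TBL1:cpus=8"  # a field filled in or patched at boot

expected = replay([STATIC_TABLE])          # verifier precomputes from build output
measured_static = replay([STATIC_TABLE])   # matches the precomputed value
measured_dynamic = replay([DYNAMIC_TABLE]) # does not match
```

With a static table the verifier can compare against a value computed from the build; once any field is generated or patched at boot, the verifier would need the actual runtime content (or an event log it trusts) to reconstruct the same value.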

Thinking about these taxonomies of firmware, from SAL+BIOS to Kittyhawk to ACSFL to EDK to EDKII to slim bootloader to coreboot to.... I realize that I might have the dubious honor of having worked on the broadest variety of host firmware at my employer.

Time marches on. In mentioning the DaVinci/Kittyhawk workstation BIOS, I realize that my colleagues on that late 90's adventure have largely left the company, from retirement to downsizing to becoming execs at other tech shops (e.g., MS). The crowd of others has thinned similarly, such as even my 2015 colleagues from https://www.usenix.org/conference/woot15/workshop-program/presentation/bazhaniuk who have all left for startups (e.g., Eclypsium) or retirement or other big tech (e.g., Amazon). Well, given those statistics and chaos in tech, this might be my last Next^* blog. Even if it is, though, I have enjoyed the run and the people I've met along the way. Hopefully I have repaid my employer's and colleagues' trust with sufficient contributions these last 27 years.  

Cheers