Unpacking the Engine: A Deep Dive Into the Win32 Portable Executable File Format

It’s funny how certain pieces of writing stick with you, isn't it? I recall writing an article way back in 1994 for Microsoft Systems Journal – the precursor to MSDN Magazine – called "Peering Inside the PE: A Tour of the Win32 Portable Executable File Format." To my surprise, it became quite popular, and even today, people reach out, mentioning how they still use it. But here's the thing about articles: they're snapshots in time. And the world of Win32 has certainly evolved since then.

That's why, starting with this piece, I'm revisiting the Portable Executable (PE) file format. Think of it as the blueprint for all the applications and libraries (EXEs and DLLs) that run on Windows. Understanding this format isn't just for the sake of trivia; it genuinely makes you a more knowledgeable programmer. It's like knowing what's under the hood of your car – it gives you a deeper appreciation and better control.

Now, you could wade through the official Microsoft specifications, and they are, of course, comprehensive. But let's be honest, specs can be a bit dry, prioritizing completeness over readability. My aim here is to cut through the jargon, explain the crucial parts, and fill in the 'hows' and 'whys' that often get lost in formal documentation. I even have a few tidbits that you might not find in the official docs.

So, what’s changed since my last deep dive? Well, for starters, 16-bit Windows is a distant memory, so we don't need to compare the PE format to the old Win16 New Executable format. And thank goodness, Win32s – that rather shaky attempt to run Win32 binaries on Windows 3.1 – is also gone. Back then, Windows 95 was still a twinkle in Microsoft's eye, and Windows NT was only at version 3.5. The linker folks hadn't yet started their aggressive optimization routines, and we were also dealing with different processor architectures like MIPS and DEC Alpha.

Fast forward to today, and the landscape is vastly different. We have 64-bit Windows, which introduced its own variation of the PE format. Windows CE supports a whole host of new processor types. Optimizations such as delay-loading DLLs, section merging, and binding, all still on the horizon in 1994, are now routine. And then there's the elephant in the room: Microsoft .NET.

From the operating system's perspective, .NET executables are still just plain old Win32 files. However, the .NET runtime understands the metadata and intermediate language embedded within them. We'll touch upon this .NET metadata format here, but a full exploration will have to wait for another article.

And if all these additions and subtractions weren't enough to warrant a fresh look, I've also discovered some inaccuracies in my original piece that make me cringe now. My explanation of Thread Local Storage (TLS) support, for instance, was a bit off. And that date/time stamp DWORD used throughout the file format? My description of it was only accurate if you happened to be in the Pacific time zone! Other things that were true then are simply incorrect now. For example, I once stated that the .rdata section wasn't particularly important, which is far from the truth today. Similarly, I claimed that the .idata section was a read/write section, which has since proven untrue.
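To make the time zone point concrete, here is a minimal sketch in Python of reading that stamp from the COFF file header and interpreting it as seconds since January 1, 1970, UTC. The names `pe_timestamp` and `fake_image` are hypothetical helpers invented for this illustration, and the synthetic image contains only the handful of header bytes the sketch needs; it is not a loadable executable.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data: bytes) -> datetime:
    """Return the COFF header TimeDateStamp of a PE image as a UTC datetime."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew (file offset of the PE signature) lives at 0x3C in the DOS header.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # IMAGE_FILE_HEADER follows the 4-byte signature:
    # Machine (WORD), NumberOfSections (WORD), TimeDateStamp (DWORD).
    (stamp,) = struct.unpack_from("<I", data, pe_offset + 4 + 4)
    # The stamp counts seconds from 1970-01-01 00:00:00 UTC, not local time.
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def fake_image(stamp: int) -> bytes:
    """Build a tiny synthetic header (illustration only, not a real executable)."""
    dos = bytearray(0x40)
    dos[0:2] = b"MZ"
    struct.pack_into("<I", dos, 0x3C, 0x40)  # PE signature right after DOS header
    coff = b"PE\0\0" + struct.pack("<HHI", 0x014C, 0, stamp)
    return bytes(dos) + coff


if __name__ == "__main__":
    print(pe_timestamp(fake_image(86400)).isoformat())
```

Passing an explicit `timezone.utc` is the point of the exercise: converting the raw value with a local-time routine shifts the result by the machine's UTC offset, which is the kind of mistake described above.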

This article, the first in a two-part series, will cover the evolution of the PE format, its core structure, how it accommodates .NET applications, the significance of PE file sections, Relative Virtual Addresses (RVAs), the Data Directory, and the process of importing functions. We'll even include an appendix with relevant image header structures and their descriptions. It's a journey back into the heart of how Windows executables work, updated for the modern era.
