
Beyond emulation: The massive effort to reverse-engineer N64 source code

It's about much more than just enabling PC ports.

Aurich Lawson

Early this week, with little warning, the Internet was graced with a Windows executable containing a fully playable PC port of Super Mario 64. Far from being just a usual emulated ROM, this self-contained program enables features like automatic scaling to any screen resolution, and players are already experimenting with adding simple graphics-card-level reshaders, including ray-tracing, as well.

The PC port, which was released with little buildup and almost no promotion, wasn't built from scratch in a modern game engine, in the manner of some other now-defunct Super Mario 64 porting projects. And its release has nothing to do with a recent leak of internal Nintendo files dating back to the GameCube days.

Instead, the port seems to be a direct result of a years-long effort to decompile the Super Mario 64 ROM into parsable C code. This kind of reverse-engineering from raw binary to easy-to-read code isn't a simple process, but it's an effort that a growing community of hobbyist decompilers is undertaking to unlock the secrets behind some of their favorite games.

Decomp 101

The two-year effort to decompile Super Mario 64 wasn't started with a Windows executable in mind. Instead, it was motivated primarily by speedrunners who wanted "to understand the game's code better in order to help find those [speedrunning] exploits," according to Kenix, who's currently helping to head the Zelda Reverse Engineering Team (ZRET) that spun off from that effort.

A hidden room from a debug version of Ocarina of Time that helped unlock some reverse-engineering secrets to the game.
The first step in reverse-engineering an N64 ROM is as basic as figuring out which specific version of the Silicon Graphics IDO compiler was used to create the ROM in the first place. This requires a lot of trial and error testing, helped in some cases by careful parsing of leaked debug builds and source code snippets left buried in the ROM files themselves. And even running such an outdated compiler for testing means emulating the SGI workstation system calls that original N64 developers used, Kenix told Ars.

The next step is simply figuring out how the ROM is organized. "How do you know where functions are? How do you know where polygon or texture data is?" Kenix said. "You have to analyze the internals of the ROM and then generate a way to split it into those various files."

Fortunately, N64 games arrange their files in 16-byte chunks, which can make it easier to see the empty "padding" marking the end of a file. And some games, like Ocarina of Time, use an easy-to-parse direct-memory-access table that defines most of the file boundaries found in the original ROM (these days, a tool like N64Split can automate this process). Debug builds of a game can also help reverse engineers document its structure, thanks to the presence of uncompressed game files and C macros like __FILE__ and __LINE__ that reveal internal file names used by Nintendo.
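To make the padding idea concrete, here is a minimal sketch in C of how a tool might flag candidate file boundaries. It is not taken from N64Split or any ZRET tool. The function name find_padded_boundaries, the min_pad threshold, and the assumption that padding bytes are zero are all illustrative choices. The sketch walks every 16-byte-aligned offset and reports those that are preceded by a run of zero bytes and followed by non-zero data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK 16u /* alignment unit for files, per the description above */

/*
 * Hypothetical helper: records 16-byte-aligned offsets that look like the
 * start of a new file, i.e. offsets preceded by at least `min_pad` zero
 * bytes and followed by a non-zero byte. Assumes padding is zero-filled.
 * Returns the number of offsets written to `out` (at most `max_out`).
 */
size_t find_padded_boundaries(const uint8_t *rom, size_t len,
                              size_t min_pad,
                              size_t *out, size_t max_out)
{
    assert(rom != NULL && out != NULL);
    assert(min_pad >= 1 && min_pad < CHUNK);

    size_t found = 0;
    for (size_t off = CHUNK; off < len && found < max_out; off += CHUNK) {
        /* Count zero bytes immediately before this aligned offset. */
        size_t pad = 0;
        while (pad < CHUNK - 1 && rom[off - 1 - pad] == 0) {
            pad++;
        }
        /* Padding followed by real data suggests a file starts here. */
        if (pad >= min_pad && rom[off] != 0) {
            out[found++] = off;
        }
    }
    return found;
}
```

A scan like this is only a heuristic. A file whose real data happens to end in zeros produces a false positive, and a file that exactly fills its last chunk leaves no padding to find, which is one reason a table of file boundaries inside the ROM is so much more reliable when a game has one.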

Hand-crafted functions

With the ROM's compiler and basic structure known, simple decompilation techniques can generate a sprawling list of the raw assembly language instructions that are fed to the N64 hardware. But converting those instructions to C code that's parsable and easily editable by humans is far from a simple process (and automated tools that convert that assembly code to C often introduce logic errors or obfuscate the code too severely).

Thus, truly reverse-engineering an N64 ROM means going through those assembly code files function by function, converting them by hand into usable C code. And unlike emulation, where "close enough" is sometimes sufficient, precision is important here. "Our goal is to match byte for byte the original assembly code of all functions in the game [after running through the compiler]," Kenix said.
A sample programming environment that lets coders easily compare the compiled results of their code to the actual game ROM in close to real time.
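The byte-for-byte goal Kenix describes boils down to a very simple check. The sketch below, in C, is an illustration and not the team's actual tooling. The name first_mismatch and its signature are hypothetical. It compares the bytes of a function as found in the original ROM against the bytes produced by compiling a hand-written C candidate, and reports where the two first diverge.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: returns the offset of the first byte at which the
 * recompiled candidate differs from the original, or -1 if the two buffers
 * are identical in both length and content.
 */
long first_mismatch(const uint8_t *original, size_t original_len,
                    const uint8_t *candidate, size_t candidate_len)
{
    assert(original != NULL && candidate != NULL);

    size_t common = original_len < candidate_len ? original_len
                                                 : candidate_len;
    for (size_t i = 0; i < common; i++) {
        if (original[i] != candidate[i]) {
            return (long)i;
        }
    }
    /* Identical prefix but different sizes still counts as a mismatch. */
    if (original_len != candidate_len) {
        return (long)common;
    }
    return -1;
}
```

Because MIPS instructions are four bytes wide, dividing the returned offset by four gives the index of the first instruction that fails to match, which tells the decompiler roughly where the C candidate needs reworking.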

Even converting a small function of a few assembly instructions in this manner can be a complicated process. But individual N64 functions can run into the thousands of instructions, and a single N64 game can have thousands of such functions (over 15,700 in the case of Ocarina of Time, for example).

The difficulty can vary by game, as well. For Super Mario 64, Nintendo compiled its source code without any fancy compiler options, meaning the decompiled assembly language is simpler to convert back to C code. For a game like Ocarina of Time, though, Nintendo used optimization flags to generate faster code, making the resulting ROM that much harder to untangle back into its source.

"When there are optimization flags, you have a harder time matching a loop to a 'for' vs 'while' [statement] etc.," Kenix said. "You have to try all equivalent patterns of code until you find the one that matches."
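As an illustration of what "equivalent patterns" means here, the following C sketch writes the same array sum three ways. The function names are made up for this example and none of this is code from either game. All three return the same result for the same input, but a compiler may emit different instruction sequences for each, so a decompiler chasing a byte-for-byte match may need to try each form in turn.

```c
#include <assert.h>
#include <stddef.h>

/* Variant 1: a 'for' loop. */
int sum_for(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += values[i];
    }
    return total;
}

/* Variant 2: the same logic as a 'while' loop. */
int sum_while(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    int total = 0;
    size_t i = 0;
    while (i < count) {
        total += values[i];
        i++;
    }
    return total;
}

/* Variant 3: a guarded 'do ... while', which tests the condition last. */
int sum_do_while(const int *values, size_t count)
{
    assert(values != NULL || count == 0);
    int total = 0;
    size_t i = 0;
    if (count != 0) {
        do {
            total += values[i];
            i++;
        } while (i < count);
    }
    return total;
}
```

Which variant matches a given function depends on the compiler version and flags in use, so the sketch makes no claim about what the original compiler would emit for any of them.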

More than just ports

Mario looks great in high-definition thanks to a PC port, but that's not the main point of decompilation efforts.

While ZRET leadership understands that PC ports are going to be a natural result of their efforts, Kenix said reverse engineers "consider that outside of the scope of what we do. We just decompile the game. Someone else will inevitably pick it up and write the PC port."

But even with decompiled C code in hand, making a PC port is "not as easy as just [saying] 'compile it for Windows,'" ZRET member Rozlette noted. "There is a lot of code that deals with talking to N64 hardware. The N64 render pipeline is very different than modern OpenGL, for example."

The process is "close but not quite" as complex as just writing an N64 emulator in the first place, Kenix said. "It remains quite difficult, especially when considering changes that are considered implicit with a PC target, like being able to change the resolution or framerate," ZRET member Roman added.

Ports aside, having the source code opens up a potential new world of mods and hacks that would be difficult or impossible by just building on top of the binary ROM. Before Zelda decompiling efforts even began in earnest, for instance, more basic reverse-engineering efforts were key to discovering the amazing method for getting Star Fox 64's Arwing into Ocarina of Time.

After a few months of work, the Zelda Reverse Engineering Team has only unspooled about 15% of Ocarina of Time's functions into C code. With time, though, they're hoping to get the game's source code to a point of "shiftability," where wholesale changes to the game can be easily coded in C rather than assembly. That's already the case for games like Super Mario 64—since the game's source code was released last September, modders have created new tools that allow for easy world editing, background art, in-level warp zones, and more.

For Zelda, "shiftable" source code could also lead to a new, more fully featured version of the Ocarina of Time Randomizer, which moves in-game items and objectives around. A randomizer built from source code could exist as a standalone ROM rather than a patch that has to be applied to the game ROM with a new seed every time, for instance.

"The knowledge learned through binary hacking of the game for years made [the Ocarina of Time randomizer] possible without decompilation," ZRET member Fig said. "We can do a lot with just assembly and general knowledge of the game. C just makes it easier to do things and will enable things that would generally be considered to be too difficult."

For some reverse engineers, though, unlocking the mysteries of N64 code is its own reward. "I do it because of my childhood love for the game," Rozlette told Ars. "It feels like a big puzzle to me where each function is a piece. It's very rewarding to me when I work at an unknown function of code and then realize I recognize what this does in the game, like, 'Hey! This is the function that spawns rupees when you cut grass!'"

The Zelda Reverse Engineering Team is always looking for more volunteers. You can sign up via the team's Discord channel.
