Use QOI for Sprites!

While working on TIME FALCON, one of the questions that came up was how small we could get the game's files. We won't have a final answer to that for at least another year, but I've begun making some progress toward making future builds more compact. Smaller files mean faster downloads over the internet, and better accessibility and portability between platforms.

One of the first things to consider was our reliance on the standard library. This might not sound like it contributes to size, since the standard library is usually part of the system. However, on non-Unix platforms (Microsoft Windows, and eventually AmigaOS 3, since I intend to write a port for that too) it has to be bundled with the application, since we specifically use the GNU standard library. Others like newlib and musl should be compatible, but we did have issues with the Microsoft CRT causing unexplained corruption during development on Windows, so glibc is what we're sticking with for the PC version. And outside of the PC, where I want to backport to a platform that's been abandoned for decades (well, almost), you can't rely on a trustworthy C++11 standard library coming with the system.

I ended up dropping attempts to make the code smaller since it was not of great importance. Naturally, the next idea was to try and drop the size of assets.

I explored a few options. At first, I used 32-bit PNG for the PC version, where we had lots of resources to spare, and 8-bit indexed PCX elsewhere. Occasionally I switched either format to grayscale, 1-bit, or just a smaller indexed palette for things like opacity masks. This already made a difference, but I could do better.

First we need to address the elephant in the room: S3 Texture Compression (S3TC), whose modern family is called Block Compression, or BCn for short, where n is a number describing the color format (it's also known as "DXT compression" according to Microsoft). If you're making a 3D game with lots of grainy textures and fine shaded details, you can stop right here. This is the best you're probably going to get. It's the fastest to load and gives the best compression to perceptual quality ratio when used appropriately. Block compression is designed specifically with GPUs in mind and it's supported by a wide range of hardware dating back over two decades. Unlike the file formats above, it is kept compressed after load and decoded in real time by the texture sampler as it's drawn to the screen, so it not only saves on storage but saves runtime memory too. Even though S3 Graphics isn't around anymore, new versions of their work are still churned out to meet increasing demands in triple-A games (and we stopped calling them "DXT"). How cool is that?

It's very cool! Unless you're making a 2D game with lots of low-res or vector artwork. Block compression formats like BCn look like steaming crap when you put them onto fine lines, let alone pixel art. Case in point, here's part of a decal made by one of the contestants in my Source Engine "Rivalries Spray Contest" on Gamebanana. Pay attention to the outline around the hand.

What's happening here is that each 4x4 block is only allowed to store two colors, kind of like the "color cells" on old machines like the Commodore 64 or NES if you're more familiar with that. Each pixel in the block then stores a low-precision fraction (just 2 bits in the simplest variant, BC1) that determines how much to blend the two colors. So the GPU can only draw the pixels within the outlined blocks with a gradient between orange and black. Once it tries to draw blue, it can't, so it approximates with the black color, and you get those jagged stair steps. The more color variation and contrast within a block, the worse it gets.
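To make the two-color limit concrete, here is a minimal sketch of decoding a single block in the simplest variant, BC1. The function names are my own, alpha is ignored, and the exact rounding of the blended colors varies between decoders, so treat the numbers as illustrative rather than as what any particular GPU produces.

```python
import struct

def rgb565_to_rgb888(c):
    """Expand a 16-bit RGB565 color to 8 bits per channel."""
    r, g, b = (c >> 11) & 0x1F, (c >> 5) & 0x3F, c & 0x1F
    # Replicate the high bits into the low bits so 0x1F maps to 255.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

def decode_bc1_block(block):
    """Decode one 8-byte BC1 block into 16 RGB tuples (row-major 4x4)."""
    c0_raw, c1_raw, indices = struct.unpack("<HHI", block)
    c0, c1 = rgb565_to_rgb888(c0_raw), rgb565_to_rgb888(c1_raw)
    if c0_raw > c1_raw:
        # Four-color mode: the two endpoints plus two blends between them.
        palette = [c0, c1,
                   tuple((2 * x + y) // 3 for x, y in zip(c0, c1)),
                   tuple((x + 2 * y) // 3 for x, y in zip(c0, c1))]
    else:
        # Three-color mode: one midpoint; the fourth entry is transparent
        # black in real decoders (alpha is not modeled in this sketch).
        palette = [c0, c1,
                   tuple((x + y) // 2 for x, y in zip(c0, c1)),
                   (0, 0, 0)]
    # Each pixel is a 2-bit index into the palette, lowest bits first.
    return [palette[(indices >> (2 * i)) & 3] for i in range(16)]

# An orange endpoint and a black endpoint, as in the decal above.
# Index bits 0xE4 select palette entries 0, 1, 2, 3 for the first four pixels.
block = struct.pack("<HHI", 0xFC00, 0x0000, 0xE4)
pixels = decode_bc1_block(block)
print(pixels[:4])
```

Every color this block can produce sits on the line between orange and black, so the blue channel is always zero. That is exactly why the blue parts of the decal collapse into black.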

These artifacts are very well hidden in noisy surface textures and photographs, which is why a similar block-based approach was popularized on the web by JPEG, and is also used by WebP where enabled. The file size reduction additionally provides headroom to increase the resolution, meaning individual blocks consume less of the total surface area, which can further hide the mess. Here though, they are a noticeable compromise.

It's also not a good format for our game even if it did look presentable, because we intend to run it on platforms that don't have 3D acceleration, and these complex block compressed formats need to be supported at the physical hardware level, else they waste lots of CPU cycles and memory to decompress at runtime. We want to load textures from disk on the fly, on a single thread, so this isn't acceptable. What is a poor game developer to do??

This is where QOI comes in. It's a lossless format, like BMP, RAS, ILBM, PCX and PNG are, but it has some key advantages that the time-tested formats lack. It strikes a balance so you no longer have to choose between shorter load times or smaller file sizes. Now, you can do both well, in high quality.

You might recognize phoboslab for his WipEout fan works, the WipEout model viewer and the WASM-compatible WipEout Rewrite. A few years ago, he published his QOI format specification, which is only a single page (wow!) and comes with a microscopic reference decoder that runs great on embedded devices (holy hell!!). Funnily enough, it was intended to be a non-serious experiment, hence the name "Quite OK Image". Yet it's gained popularity in recent years for its close competition with the long-standing PNG format. Conveniently, given our multi-platform use case, it's also already been reimplemented as an Amiga Datatype. If you don't know what datatypes are, they're awesome and I'm deeply sorry that IBM has robbed you.

QOI's greatest strength is its simplicity. The author describes it as "stupidly simple". This means two things:

QOI files decode really damn quickly. Each pixel is processed only one time and is only three or four bytes long when decompressed.
You can modify the format and write your own decoder for it with very little effort.
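To give a feel for how little there is to it, here is a rough sketch of a complete QOI decoder written from the published specification. It is not the reference decoder and not the code we ship; the function name and the hand-built sample file are my own, and it skips most error handling (truncated files, the end marker, size limits).

```python
import struct

def qoi_decode(data):
    """Decode a QOI file held in `data`; returns (width, height, channels, pixels)."""
    magic, width, height, channels, _colorspace = struct.unpack(">4sIIBB", data[:14])
    if magic != b"qoif" or channels not in (3, 4):
        raise ValueError("not a QOI file")
    index = [(0, 0, 0, 0)] * 64      # table of previously seen pixels
    r, g, b, a = 0, 0, 0, 255        # the spec's starting pixel
    out = bytearray()
    pos, run = 14, 0
    for _ in range(width * height):  # each pixel is visited exactly once
        if run > 0:
            run -= 1                 # still repeating the previous pixel
        else:
            b1 = data[pos]; pos += 1
            if b1 == 0xFE:           # QOI_OP_RGB: full color, alpha unchanged
                r, g, b = data[pos:pos + 3]; pos += 3
            elif b1 == 0xFF:         # QOI_OP_RGBA: full color and alpha
                r, g, b, a = data[pos:pos + 4]; pos += 4
            elif b1 >> 6 == 0:       # QOI_OP_INDEX: reuse a table entry
                r, g, b, a = index[b1]
            elif b1 >> 6 == 1:       # QOI_OP_DIFF: per-channel deltas of -2..1
                r = (r + ((b1 >> 4) & 3) - 2) & 0xFF
                g = (g + ((b1 >> 2) & 3) - 2) & 0xFF
                b = (b + (b1 & 3) - 2) & 0xFF
            elif b1 >> 6 == 2:       # QOI_OP_LUMA: green delta, red/blue relative to it
                b2 = data[pos]; pos += 1
                dg = (b1 & 0x3F) - 32
                r = (r + dg - 8 + (b2 >> 4)) & 0xFF
                g = (g + dg) & 0xFF
                b = (b + dg - 8 + (b2 & 0x0F)) & 0xFF
            else:                    # QOI_OP_RUN: this pixel plus (b1 & 0x3F) repeats
                run = b1 & 0x3F
            index[(r * 3 + g * 5 + b * 7 + a * 11) % 64] = (r, g, b, a)
        out += bytes((r, g, b, a)[:channels])
    return width, height, channels, bytes(out)

# Hand-built 4x1 RGBA file: one RGB chunk, a run of two, one small diff, end marker.
sample = (b"qoif" + struct.pack(">IIBB", 4, 1, 4, 0)
          + bytes([0xFE, 10, 20, 30, 0xC1, 0x79])
          + bytes(7) + b"\x01")
width, height, channels, pixels = qoi_decode(sample)
print(width, height, channels, list(pixels))
```

The whole format is six chunk types and a 64-entry table, which is why changing it or porting it to an odd platform is a weekend job rather than a project.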

There is a third, undocumented benefit, hiding inside what at first glance appears to be a major shortcoming: QOI files still contain a lot of redundancy. One easy criticism of QOI, therefore, is that it doesn't serve much practical purpose as a delivery format over the web or for individual images. This changes drastically in the context of video game development, where load times from local storage are seriously important, and thousands of these bitmaps are delivered together during installation, not at random times from random places. So, there's more opportunity for cross-compression between lots of files. Although QOI employs multiple techniques to reduce redundancy, and is very good at it, there is still enough left over for another lossless compression algorithm, like zlib's DEFLATE or another LZ-family codec, to take it further. The obvious next step then is to put a QOI in one of these compressed containers. This is similar to what PNG does, after all: it runs its filtered pixel data through zlib, with a notable performance cost. But we can do better than that. Since we're distributing lots of these QOI files in a single asset pack with a game, we can keep using the standard QOI algorithm and offload any further compression to the asset pack containing all of them, where the encoder creating the final archive file can take every single sprite in the entire game into context. The result?

Okay, it's not a monumental difference here, but that was only with six test images, the total of which was 8.1 MB when just using QOI. Most of that is coming from just one very high res 5.6 MB image that couldn't compress very well to begin with. With or without it, QOI images seem to additionally compress with a ratio from 68% to 76% when packed into a ZIP using PeaZip's "normal" preset. This is a tiny sample size but the differences are sure to add up with larger dictionary sizes and with a larger set of images compressed together.
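The cross-file idea can be sketched with nothing but zlib. The "files" below are synthetic stand-ins that I made up for illustration (a shared chunk plus a little unique data each), not real QOI sprites, so the sizes say nothing about actual ratios. One thing worth knowing: plain ZIP compresses every entry independently, so sharing context between files needs a solid archive (7z, or tar plus a compressor) or your own pack format.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the demo is repeatable

# Hypothetical stand-ins for sprite files: every "file" repeats the same
# 1 KiB chunk (think shared palettes or tiles) plus 64 unique bytes.
shared = bytes(rng.randrange(256) for _ in range(1024))
files = [shared + bytes(rng.randrange(256) for _ in range(64)) for _ in range(8)]

# Compressing each file on its own: the compressor never sees that the
# other files contain the same data.
separate = sum(len(zlib.compress(f, 9)) for f in files)

# Compressing the whole pack as one stream: later files can refer back
# to earlier ones, as long as they fit in the compressor's window.
pack = b"".join(files)
solid = len(zlib.compress(pack, 9))

print("separate:", separate, "bytes")
print("solid:   ", solid, "bytes")
```

DEFLATE only looks back 32 KiB, so this toy only works because the files are tiny. For real sprites you would want a compressor with a much larger dictionary, which is the "larger dictionary sizes" point above.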

Does this kind of defeat the point of QOI's simplicity? Somewhat. But these parts of game development are all about making subjective tradeoffs. And it's still definitely an improvement over PNG, where you have to do two decompression passes: a zlib inflate for the PNG itself, and then another decompression for the archive file containing the PNG. I wanted to make more people aware of this option because it seems like a real missed opportunity for free storage, and the load times are surely much faster.

There is one catch. QOI's official support for different color modes is lacking. It has no indexed palette mode (though it uses a 64-entry table of previously seen colors for compression), and it's horrible for single-channel bitmaps. The official specification only supports either three or four 8-bit color channels, either RGB or RGBA. If you're packing more than four channels, you'll want to either use a separate color texture for that extra information in multiples of 3 or 4 channels, or expand the format to hold more channels per file. For other uses like opacity masks, bump maps or 2D normals, you might want to try something else.
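As a sketch of the "separate texture" route, here is one way to split an interleaved buffer with more than four channels into several 4-channel buffers that a stock QOI encoder could accept. The function name and the zero-padding of the leftover channels are my own assumptions, not anything the QOI specification prescribes.

```python
def split_into_rgba_planes(pixels, channels):
    """Split interleaved 8-bit pixels into 4-channel buffers.

    `pixels` holds `channels` bytes per pixel. The last buffer is padded
    with zero bytes when `channels` is not a multiple of 4.
    """
    count = len(pixels) // channels
    planes = []
    for first in range(0, channels, 4):
        last = min(first + 4, channels)
        plane = bytearray()
        for i in range(count):
            px = pixels[i * channels + first : i * channels + last]
            plane += px + bytes(4 - len(px))  # pad short groups with zeros
        planes.append(bytes(plane))
    return planes

# Two pixels with six channels each, e.g. color plus some extra material data.
six_channel = bytes(range(12))
planes = split_into_rgba_planes(six_channel, 6)
print([list(p) for p in planes])
```

Each returned buffer can then be encoded as its own 4-channel QOI image, at the cost of some wasted padding in the last one.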

If you use Allegro, you're in luck! I wrote a plugin to add QOI support to Allegro. It uses the reference decoder and that introduces some minor problems with the way data gets copied around, but those are easy fixes. Allegro already has built-in integration with PhysicsFS which made this a no-brainer. I talked with SiegeLord and there is interest in merging this upstream into Allegro 5.2.11 in the future. Canithesis will be including the plugin with the public release of Mythos Engine, as well as our own refined format based on it. Cheers!