Idea for muOS: Native support for space-saving archives (.ssmc) and rich metadata containers (.gcase)

Hello muOS developers and community,

First off, thank you for creating such a fantastic and streamlined custom firmware. I’m a huge fan of the work you do for the retro handheld community.

I’m a developer who has been working on two open-source projects specifically designed to improve how we manage our retro game collections. I believe they could be a powerful addition to muOS, and I wanted to propose them for potential integration.

The two projects are SpriteShrink and GameCase.

SpriteShrink: Solving the SD Card Space Problem

We all know that SD card space can be a real issue, especially when you have multiple revisions and regional variants of the same game. SpriteShrink is a command-line tool I built to tackle this head-on.

  • What it does: It takes all versions of a single game (e.g., USA, EU, JP, Rev A) and intelligently deduplicates them into a single, highly compressed .ssmc archive. It finds all the common data between the files and only stores it once.
  • The Benefit for muOS: Users could store their entire multi-region collection for a game in a file that’s often only slightly larger than a single ROM. This would free up a massive amount of space on the SD card, allowing users to fit more games on their devices.
  • How it would work: muOS could be updated to recognize .ssmc files. When a user selects one, the frontend could present a sub-menu asking which version of the game to launch (e.g., “Tecmo Bowl (USA)”, “Tecmo Bowl (USA) (Rev 1)”, etc).

GameCase: A Richer, More Organized Library

A game is more than just the ROM. It’s the box art, the manual, maybe even ROM hacks. GameCase is a tool and a file format (.gcase) I designed to bundle all of these assets together.

  • What it does: It creates a single .gcase file that can contain:
    • The game ROM (or even a space-saving SpriteShrink archive).
    • Game manuals in PDF or other formats.
    • Multiple types of artwork (box art, cartridge scans, fan art).
    • ROM hack patches and their metadata.
  • The Benefit for muOS: This could enable a much richer user experience. Imagine selecting a game in the muOS interface and having an option to open the game’s manual directly on the device before playing. Or, if a .gcase file contains a ROM hack, muOS could offer to apply it on the fly.
  • Note: GameCase is currently a work in progress, so any potential integration can focus on SpriteShrink for now.

Technical Details & Integration Path

I’ve designed both projects with integration in mind, and I believe the path to adding them to a C-based project like muOS would be straightforward.

  • lib_sprite_shrink (For .ssmc files):
    • The core logic of SpriteShrink is available as a Rust library.
    • Crucially, I’ve already built a C-compatible FFI layer and use cbindgen to generate a C header file (.h).
    • This means the muOS developers could link against lib_sprite_shrink.a or .so and call C functions to list the contents of an .ssmc archive and extract a specific file’s data into a buffer, ready to be passed to the emulator.
    • The library is licensed under the MPL-2.0, which allows it to be linked into a larger project without imposing its license on the entire codebase.
    • Another potential path, once I get the JSON-based output done for the CLI application, is to read the file metadata from the archive, populate a screen much like a folder in the muOS interface, and, once the user selects a file, use the CLI binary to extract the ROM to a cache location and run it from there.
  • lib_game_case_parser (for .gcase files, once it’s done):
    • The .gcase format is based on EBML (the same tech as .mkv files).
    • I’ve written a dedicated Rust parser library for it, which is also licensed under the permissive MPL-2.0.
    • While I haven’t built an FFI layer for this one yet, it would be a very similar process to the SpriteShrink library if there is interest from the muOS team. The CLI approach I suggested for SpriteShrink likely wouldn’t work well for GameCase integration, since I can see it needing a large number of flags.
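
To make the CLI-based path a little more concrete, here is a rough Python sketch of how a frontend might turn a JSON listing into menu entries. The JSON shape (a "files" array with a "name" field) and the .nes extension are purely assumptions for illustration, since the JSON output does not exist yet, and menu_entries is a made-up helper name.

```python
import json

# Hypothetical listing; the real JSON output of the CLI is not finished,
# so this shape is an assumption made only for the sketch.
listing_json = """
{"files": [
  {"name": "Tecmo Bowl (USA).nes"},
  {"name": "Tecmo Bowl (USA) (Rev 1).nes"}
]}
"""

def menu_entries(raw):
    """Return sorted display labels (file names without extension)."""
    entries = json.loads(raw)["files"]
    names = sorted(entry["name"] for entry in entries)
    return [name.rsplit(".", 1)[0] for name in names]

labels = menu_entries(listing_json)
for position, label in enumerate(labels, start=1):
    print(position, label)
```

The frontend would then map the chosen label back to the archive entry and ask the CLI binary to extract it to the cache location.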

I have already successfully cross-compiled and tested SpriteShrink for the RG35XX+ using the community toolchain, so I’m confident it can run efficiently on the hardware muOS supports.

Conclusion & Links

I believe that integrating these tools could offer muOS users a significant advantage in managing their libraries, saving them space and providing a richer, more organized experience.

I would be more than happy to assist in any way I can, whether that’s providing pre-compiled ARM libraries, helping with the integration code, or answering any questions the development team might have.

Thank you for your time and consideration!

Best regards,
Zade222


Does this support other archive formats? There’s nothing special sounding about block level compression there for .ssmc - case in point, several collections I’ve seen are distributed in .zip that is doing the same thing, especially for regional roms and romhacks where most of the bits are the same between the images.

I’m not sure I understand what you mean.

Everyday archive formats like zip compress each file on its own and do not look for byte content shared between files, while ssmc does, thanks to its dictionary generation. To be more specific, in a zip each file is compressed and stored as a distinct entity, so if an archive contains multiple identical or similar files, the redundant data is stored again for each instance, leading to a larger overall archive. From looking into it a bit, 7z and rar do offer similar cross-file functionality.
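
A quick way to see the per-file behaviour is the Python sketch below, which uses only the standard zipfile module. The "ROM" is just seeded pseudo-random bytes (an assumption for the demo, so that compression itself cannot hide the effect), and the file names are made up.

```python
import io
import random
import zipfile

def zip_size(members):
    """Size in bytes of an in-memory zip holding the given name -> data map."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, data in members.items():
            zf.writestr(name, data)
    return len(buf.getvalue())

# Stand-in for a ROM: incompressible pseudo-random bytes from a fixed seed.
rng = random.Random(0)
rom = bytes(rng.getrandbits(8) for _ in range(64 * 1024))

one_copy = zip_size({"game (USA).bin": rom})
two_copies = zip_size({"game (USA).bin": rom, "game (EU).bin": rom})

print("one copy:  ", one_copy, "bytes")
print("two copies:", two_copies, "bytes")
```

Even though the second member is byte-for-byte identical to the first, the archive roughly doubles in size, because each zip member is compressed independently.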

Compressed tar archives come out smaller; however, such a solid archive format causes extraction and random access to suffer. For really small games like NES titles it’s less of a concern, but for larger ones it’s more noticeable. As I stated on the GitHub page, I hope to expand this to larger files such as optical media images (ISOs) so that individual chunks can be requested and extracted as needed.

After some discussion of this project on Reddit, I’ll lay out some more pros, and the reasoning behind them, below:

Superior Deduplication with Global Chunking
Unlike the “sliding window” method used by solid compressors like 7-Zip, SpriteShrink first scans all files and breaks them down into content-defined chunks using FastCDC. This allows it to create a global database of every unique piece of data. As a result, it can find and eliminate redundant data across the entire ROM variant collection, regardless of where the data appears. A sliding window can miss duplicates if they are too far apart in the data stream to be in the window at the same time.
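
To illustrate the idea, here is a minimal Python sketch of content-defined chunking feeding a global chunk store. It is not FastCDC and not SpriteShrink’s actual code: it swaps in a simplified gear-style rolling hash, and the chunk sizes, function names and sample data are all assumptions chosen for the demo.

```python
import hashlib
import random

# Fixed table of random 32-bit values for the rolling hash (seeded, so repeatable).
_table_rng = random.Random(1)
GEAR = [_table_rng.getrandbits(32) for _ in range(256)]

def chunk(data, min_size=64, avg_mask=0xFF, max_size=1024):
    """Split data where the rolling hash hits a boundary pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & avg_mask) == 0) or size >= max_size:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks

def build_archive(files):
    """Return (store, manifest): unique chunks by hash, and per-file hash lists."""
    store, manifest = {}, {}
    for name, data in files.items():
        digests = []
        for piece in chunk(data):
            digest = hashlib.sha256(piece).hexdigest()
            store.setdefault(digest, piece)  # each unique chunk is kept once
            digests.append(digest)
        manifest[name] = digests
    return store, manifest

def restore(store, manifest, name):
    """Rebuild a file byte-for-byte from its chunk list."""
    return b"".join(store[d] for d in manifest[name])

# Two made-up variants: the revision has a small patch inserted mid-file.
data_rng = random.Random(2)
usa = bytes(data_rng.getrandbits(8) for _ in range(32 * 1024))
rev1 = usa[:20000] + b"REV1 PATCH" + usa[20000:]

store, manifest = build_archive({"Game (USA)": usa, "Game (USA) (Rev 1)": rev1})
stored = sum(len(piece) for piece in store.values())
print("input bytes:", len(usa) + len(rev1), "unique chunk bytes:", stored)
```

Because the cut points depend on the content rather than on fixed offsets, the chunks before the patch hash identically in both variants and are stored only once, no matter how far apart the duplicates sit.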

Fast, Efficient Random Access
A major drawback of solid compression is that to extract a file near the end of an archive, the entire preceding stream must be decompressed. This is very slow on SD cards or network drives common in emulation. Because SpriteShrink compresses each unique chunk individually, it only needs to decompress the specific chunks required for a single file, allowing for much faster random access.
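
As a rough sketch of why per-chunk compression helps here, the Python below compresses every chunk on its own with zlib and counts how many chunks must be decompressed to pull out one file. Fixed-size chunks, zlib, and the file names are simplifications I am assuming for the example, not what SpriteShrink actually uses.

```python
import zlib

CHUNK = 4096  # fixed-size chunks keep the sketch short

def pack(files):
    """Compress every chunk on its own; index maps file name -> blob positions."""
    blobs, index = [], {}
    for name, data in files.items():
        positions = []
        for offset in range(0, len(data), CHUNK):
            blobs.append(zlib.compress(data[offset:offset + CHUNK]))
            positions.append(len(blobs) - 1)
        index[name] = positions
    return blobs, index

def extract(blobs, index, name):
    """Decompress only the chunks that belong to one file."""
    touched = index[name]
    data = b"".join(zlib.decompress(blobs[i]) for i in touched)
    return data, len(touched)

# Three made-up 16 KiB files; the one we want sits at the end of the archive.
files = {n: (n.encode() * 16384)[:16 * 1024] for n in ("a.rom", "b.rom", "c.rom")}
blobs, index = pack(files)
data, touched = extract(blobs, index, "c.rom")
print("decompressed", touched, "of", len(blobs), "chunks")
```

A solid stream would have to be decompressed from the start to reach the last file; here only that file’s own chunks are touched.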

Greater Resilience to Corruption
In a solid archive, a single corrupted bit can make all subsequent data in the archive unreadable. With SpriteShrink’s chunked approach, corruption is isolated. If a chunk is damaged, it only affects the specific files that rely on that single piece of data, leaving the rest of the library intact.
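
Here is a small Python sketch of that isolation, assuming each chunk is addressed by its SHA-256 hash and checked on read. The layout, chunk size and names are assumptions for illustration only.

```python
import hashlib

def make_archive(files, chunk_size=1024):
    """Store chunks keyed by their SHA-256; manifest lists each file's chunks."""
    store, manifest = {}, {}
    for name, data in files.items():
        digests = []
        for offset in range(0, len(data), chunk_size):
            piece = data[offset:offset + chunk_size]
            digest = hashlib.sha256(piece).hexdigest()
            store[digest] = piece
            digests.append(digest)
        manifest[name] = digests
    return store, manifest

def extract_verified(store, manifest, name):
    """Rebuild a file, refusing any chunk whose hash no longer matches."""
    pieces = []
    for digest in manifest[name]:
        piece = store[digest]
        if hashlib.sha256(piece).hexdigest() != digest:
            raise ValueError(f"corrupt chunk {digest[:8]} needed by {name!r}")
        pieces.append(piece)
    return b"".join(pieces)

shared = b"S" * 1024
files = {
    "a.rom": b"A" * 1024 + shared,
    "b.rom": b"B" * 1024 + shared,
}
store, manifest = make_archive(files)

# Simulate damage to the chunk that only a.rom depends on.
store[manifest["a.rom"][0]] = b"X" * 1024

try:
    extract_verified(store, manifest, "a.rom")
    a_ok = True
except ValueError as err:
    a_ok = False
    print("a.rom:", err)

b_data = extract_verified(store, manifest, "b.rom")
print("b.rom intact:", b_data == files["b.rom"])
```

Only the file that references the damaged chunk fails; everything else in the archive still extracts and verifies.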

Preserves Original ROM Data
There have been suggestions to use 1G1R sets (which often involve trimming ROMs to remove unneeded data); however, I feel this is not a good approach because such modifications can sometimes cause problems with emulators. SpriteShrink’s approach is non-destructive: the ROMs remain untouched, simply compressed in a way that allows them to be restored perfectly, avoiding potential headaches.

Designed for Library Management
While emulators can run a single ROM from a .7z or other solid-compression archive file, you cannot package all variants of a single game into one file and have the emulator efficiently select one game. SpriteShrink is designed, with future integration in mind, to allow for exactly that: storing a complete, deduplicated variant set in a single archive from which any individual game can be quickly extracted on demand.

Hello again, everyone!

I’ve been running some benchmarks with SpriteShrink and wanted to share the results with you all, especially the dev team, to better show its merit. I think I can help muOS in some areas in addition to its use for ROM compression.

Firmware Image Compression

I know that packaging and uploading new firmware images, especially for multiple devices, can be time-consuming. I ran some tests on the H700 and TrimUI firmware images for the recent Goose release, and the results are better than I was anticipating.

Here is a table for the results from compressing the images compared to more traditional archives (I apologize ahead of time for the table screenshots, it’s the only way I could get them remotely readable on here):

Given how well the images compress, there seems to be some merit in using SpriteShrink as part of the delivery of new system images. To be more specific, if a collection of images targets the same CPU/architecture, the compiled binaries are largely, if not exactly, the same, so in many cases the .ssmc file ends up quite small because the images share quite a bit of duplicate data.

This could translate to a massive reduction in both upload times for XongleBongle (or whoever is largely responsible for uploading data) and download times for end users. Instead of distributing a separate large zip archive for each device, a small number (potentially 1 to 3) of highly compressed .ssmc archives could be used.

I am also working on a binary that looks for the first .ssmc file in the folder it is executed from, reads the archive’s metadata, prints a list of the files in the archive to the console, and waits for the user to specify which file(s) they want. Once the user has made their selection, it decompresses those files to a chosen location or to the folder it is being executed from (I haven’t made up my mind which is better). That way you could provide, in a zip in store mode (so it won’t compress anything), a decompression binary for each major OS (macOS, Windows and Linux), the .ssmc file containing the images, and a readme or anything else you may want to distribute with it.

When the binary is used for decompression it also checks the integrity of each chunk of data as it’s decompressed, so if a single chunk is found not to have a matching hash, the application notifies the user that the archive may be corrupt. This reduces support effort on your part if a user gets a bad download.

Potential For Update Delivery

It could also help with the distribution of updates. Should you decide to start distributing updates that need to be tailored to each individual device, then instead of putting each into a zip, the following could potentially be done:

  1. Prepare each device’s update.
  2. If each update is a single image, feed each one into SpriteShrink. If each is a collection of files, tar them instead of zipping them to serialize the data, then feed each tar into SpriteShrink.
  3. With the right internal logic of the OS/frontend, the user could then open the .ssmc file containing all of the update data, the system based on some criteria would identify the correct update in the .ssmc file, decompress/hydrate the tar, pull the files from the tar and then move/copy the files to their respective locations much like the muxzips currently accomplish, thus updating the system.
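
The steps above can be sketched roughly as follows in Python, using the standard tarfile module. A plain dict stands in for the .ssmc archive, and the device keys, file paths and helper names are all made up for illustration; real selection criteria and install locations would be up to the muOS team.

```python
import hashlib
import io
import tarfile

def make_tar(files):
    """Serialize a path -> bytes map into an uncompressed tar (step 2)."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for path, data in files.items():
            info = tarfile.TarInfo(name=path)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()

def read_tar(blob):
    """Read a tar back into a path -> bytes map."""
    out = {}
    with tarfile.open(fileobj=io.BytesIO(blob), mode="r") as tar:
        for member in tar.getmembers():
            if member.isfile():
                out[member.name] = tar.extractfile(member).read()
    return out

def build_bundle(updates):
    """Stand-in for the .ssmc: device key -> (sha256 of tar, tar bytes)."""
    bundle = {}
    for device, files in updates.items():
        blob = make_tar(files)
        bundle[device] = (hashlib.sha256(blob).hexdigest(), blob)
    return bundle

def apply_update(bundle, device):
    """Pick this device's tar, verify it, and return the files to install (step 3)."""
    digest, blob = bundle[device]
    if hashlib.sha256(blob).hexdigest() != digest:
        raise ValueError(f"update for {device!r} failed its integrity check")
    return read_tar(blob)

# Hypothetical per-device updates (step 1); the paths are invented for the demo.
updates = {
    "h700": {"demo/version.txt": b"h700 build\n", "demo/common.bin": b"\x00" * 512},
    "trimui": {"demo/version.txt": b"trimui build\n", "demo/common.bin": b"\x00" * 512},
}
bundle = build_bundle(updates)
installed = apply_update(bundle, "h700")
print(sorted(installed))
```

In a real integration the returned files would be copied to their target locations, much like the muxzips do today, and the hash check would happen per chunk during hydration rather than once per tar.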

And since the SpriteShrink binary checks each chunk as it is decompressed, update integrity is verified as the tar is decompressed/hydrated. Again, this reduces potential support headaches when a user has a bad download or the .ssmc update archive is corrupted while being written to their SD card.

Game Benchmark Results

Here is a table with the benchmarks using a couple of games, which is the main/original purpose of this archive/file format:

As you can see, the results are pretty good! And given the internal layout of the archive, querying information from it is quick, so it works well over slow media like an SD card and/or network shares. This is because it doesn’t need to parse the whole archive to reach data in the middle or at the end in order to decompress something.

I’d love to get my hands dirty and work on the integration. Just give me a thumbs up and I’ll get to work once I get the beta of SpriteShrink out, which I am currently working toward in the next week or two.

I was remembering .7z archives, not .zip, my mistake.

I’ve been dealing with a variation of this problem for a while. Instead of forcing someone to pick which ROM from a set each time they want to play it, I think it’s generally better to extract discrete ROMs and play those without having to manage the whole set. If we could somehow support referencing/loading/etc. a specific ROM within the archive, that would be completely novel and worth pursuing (maybe via a FUSE FS?)