A History of OpenOMF - 28 Mar 2025
As we are about to release 0.8.0 of OpenOMF, I wanted to look back a bit on my involvement with the project and its predecessor, which goes back to late 2004, or really to 1994. I am going to recount the story mostly from memory, so there may be some errors or misconceptions in what follows.
One Must Fall 2097 was a DOS fighting game for the IBM PC. It was developed by a small Florida game development company called Diversions Entertainment, and it was published by Epic MegaGames. The game was the commercial version of an earlier shareware fighting game (which we call omf 1) that a young programmer named Rob Elam had released. For 2097 the game was massively expanded to include 10 unique fighting robots (called Human Assisted Robots or HARs in the game’s lore), 10 single player pilots for those HARs, a single player boss character, a tournament mode with RPG elements and a remarkable number of game options and secrets.
I was first exposed to the game via the shareware demo, which I believe we got on a CD or floppy taped to the front of a computer magazine (this was the era in which downloading more than a few hundred kilobytes from the internet was an all day affair). My brother and I, having never really played a fighting game outside an arcade before, were enthralled. We played the heck out of the demo and quickly convinced our parents we needed the full copy. My parents did whatever bizarre ordering procedure the time called for, and a few weeks later a box edition of the game arrived, complete with the manual, a poster and a strategy guide (all of which I still have). We then proceeded to play the game obsessively for most of a summer vacation.
I think everyone has some encounter with media that hits them at just the right time, whether a book, a movie, a song or a video game. You’re receptive to it in some way that makes it hard to explain to others because in consuming the media you are yourself changed by it. This was one of those pieces of media for me. When I taught myself 3D modeling some of my first ever 3D models were HARs from 2097.
Once the Internet was more of a thing, in the late 90s and very early 00s, I discovered Diversions Entertainment was working on a 3D sequel to OMF, called One Must Fall: Battlegrounds. I dabbled a bit in the online community that had formed around the game, and I tried Battlegrounds when it came out, but I found it a bit underwhelming and clunky compared to the original.
Several years later, I had graduated high school (barely), dropped out of college (more school just wasn’t something I could do), spent a year abroad living in Germany, and then returned home to Ireland, at a bit of a loss as to what to do next. For some reason I decided to pick up OMF2097 again. I found that, while networking support had been added and the game itself had been made freeware in 1999, it no longer ran well under Windows 2000 and you had to use something called “DOSBox” to run it. However, I could never get the game to “feel” right under DOSBox, no matter how much I tweaked the cycles or the settings. I had also, in the intervening years, learned how to program, primarily in a “new” language called Ruby. I decided I was going to try to recreate the game using Ruby and a game engine called Gosu. I had done a bit of OpenGL and C++ programming before this and decided I wanted nothing to do with them, so Ruby/Gosu let me focus on the parts that I found interesting.
I had found there were some fan-made tools for unpacking/repacking some of the game assets, especially the “AF” files, which is where the HAR information was stored. These tools also documented the binary file format, and how to extract it. I then had to teach myself how to work with binary files from Ruby (turns out String.unpack/pack support some pretty complex specification strings). I then wrote some tools to decompile the assets into sprites and giant XML files of the known data. This proved to be a mistake, as I spent a lot of time messing around with updating the representation of the data as I learned more about it.
After a little while, I had something that looked a bit like a game (although it didn’t really act like one). I created a RubyForge (RIP) project for it called rubyomf2097 and posted about it to the OMF forums. People were interested, but skeptical that it was going to lead anywhere (apparently I was not the first person to tilt at this windmill, although I believe I got the furthest). Eventually life got in the way, and I sort of stalled out on the project (although I had developed some tools for editing the asset files and learned a whole bunch along the way). There was just too much unknown about how the game worked, and things seemed to be much more complex than they might appear at first glance. I did remain around the community, and in the #omf IRC channel on Freenode (RIP).
Then, sometime in 2012, someone called “katajakasa” posted about their OMF2097 remake, this time in C++. I had been programming professionally for several years by then, and had done a fair amount of C programming. I had also done enough C++ to realize I really didn’t like it, so I proposed joining forces if he agreed to switch to C. He agreed, so he and I and another OMF2097 fan from Australia, “animehunter”, joined forces and started on another remake. We ported over what we had from the 2 previous codebases and started implementing libraries with encoders/decoders for the various game formats. As this progressed we also started building a new game engine from scratch, using SDL2 as the base to give us basic things like window handling, input, etc.
We made pretty good progress for the next couple years, but after about 2014 the pace of the project slowed. It turned out the game we had decided to reimplement was vastly more complex and confusing than we had expected. The game had its own internal scripting language that was used to control what effects would happen on each frame of animation. This scripting language was difficult to understand and reverse engineer given our tools and skillset. Katajakasa did some decompilation using IDAPro, and I would use our tools to decompile the assets, edit them and recompile them to see what would change in the original game. This was extremely tedious and error prone, although we did manage to solve several mysteries, like how collision detection worked, and a bunch of other game mechanics (move types, how moves chain together, etc).
I also implemented a version of network play, using somewhat more modern methods (the original used IPX/SPX in lockstep mode, where nothing could happen until the other side acknowledged it), although I learned the hard way that fighting games are notorious for being the hardest game type to write netcode for. The approach I took ended up being very brittle and flawed, but I lacked the energy to try again.
So the project went somewhat dormant. We had some contributions from the community, and katajakasa kept working on things here and there, but I had essentially stepped away from doing anything, as had animehunter. I returned briefly in early 2023 to implement the majority of Tournament mode, but then I went dormant again. katajakasa had been working on a rewrite of the rendering layer for a few years (it turns out simulating a VGA video buffer in modern OpenGL is a bit tricky), but progress was pretty slow.
Then, miraculously, things started to come back to life around January of 2024. A few new contributors arrived: martti, Nopey, Insanius and nopjne. We also started using Ghidra for reverse engineering (we had been using it a little during the lull as well). In August I left my job to take a break, and I decided to spend some of my programming energy on OpenOMF. I started by rewriting the network code from scratch, implementing proper GGPO-style rollback netcode, which ended up being as difficult as expected. I also implemented a network lobby, NAT support and UDP hole punching support for the network client.
We finally made an official release for the first time in over 10 years, 0.7.0 (and a couple followup bugfix releases), and we’ve even packaged the game for Flatpak.
An intrepid contributor managed to port the game to the Nintendo 64 using libdragon. A very impressive achievement, and one we intend to support in the mainline codebase. This has proven the efficiency and portability of our engine, and hopefully will help lay the groundwork for further ports.
We also finally landed the new rendering code, and have been rapidly progressing on features and bugfixes since. We’ve restored and repaired support for game recordings from the original engine, and we’ve figured out how to use them both as a way to inspect behaviour in the original engine and as a place to embed assertions our engine can check, so we can also use them as unit tests.
We (mostly Insanius) also documented the memory layout of the original game enough that we can dump player position/velocity/health/endurance/etc at runtime. I wrote a simple C utility called OneMustSee that can be pointed at a DOSBox pid. This allows us to play back a known recording in the game, use the memory dumper to dump the memory values, then use those values to annotate the REC for playback in our engine. This currently reveals a LOT of small incompatibilities, but we have finally developed a pretty robust suite of tools for interrogating the original engine and ensuring our own engine matches it.
With the release of 0.8.0, we are considering the game to be in “alpha” state, meaning that all the major features are implemented. Minor features may not be implemented, and there may be some bugs or incompatibilities. The next focus will be on getting all the smaller features implemented and correcting whatever bugs we find along the way. Once we are confident that all features are implemented, we will tag a 0.9.0 and then work on fixing all remaining known incompatibilities until we reach 1.0.
We are also exploring a mod framework for the engine, to allow for things like higher-resolution assets, rebalancing, new arenas, enhanced features for tournament mode, etc. Our project is actually one of the only open source fighting game engines, and it has a unique lineage compared to all the others (because OMF2097 itself was a bit of a weird fighting game), so total conversions or other changes to the engine would also be possible.
If any of this sounds interesting, you’re welcome to swing by our Discord or GitHub. We could always use more people to test, report bugs, play around with reverse engineering or C code, or just hang out. Community engagement is all that keeps projects like this going, so if you know of a similar project you’d like to see continue on, make sure to let them know you appreciate the work they’re doing.
Looking back on 20 years of this project, in one form or another, maybe I can distill some lessons from it all. I think had we known then what we know now about the scope of this project, we probably would not have tried. This game turned out to be much more complex to implement than we expected, and to have a lot of unique features and quirks. I do think, however, that I’ve learned a lot of useful things as a result. It taught me how to work with binary files, and helped improve my C programming skills, my network programming skills, my ability to reverse engineer systems, how to use a debugger, etc. So if anyone out there is considering a similar project, do not be dissuaded; just prepare for it to take a bit longer than you expect. I do think we are finally in the home stretch, but we just don’t know exactly how far away the finish line is, still.
Finally, I’d like to thank everyone who HAS participated or contributed over all these long years. Every little spark of interest has helped us keep going.
Field notes on extending the Erlang packet parser - 30 Dec 2018
It’s that time again, dear reader, in which I get caremad about something and go off on a Quixotic adventure to do something about it. The target of my ire this time is binary network protocols that are not length prefixed and how to handle them in Erlang.
One of the great things in Erlang is active mode for sockets and the {packet, N} option. Setting options like {active, true}, {packet, 4} tells Erlang to send the owner of the socket a message that looks like {tcp, Socket, Payload} every time it receives a 4-byte big-endian length-prefixed packet. Even better, sending on that socket automatically prefixes the payload with the 4-byte prefix. This makes framing and deframing streams of data on sockets in Erlang trivial, so long as both sides support and use this simple framing format. It also allows the Erlang process owning the socket to do other things while the packet is being accumulated by the runtime system. This is helpful because your gen_server or whatever can just define a handle_info clause for packets instead of having to periodically read the socket for any pending data.
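To make the framing concrete, here is a rough sketch of what {packet, 4} does on the wire; it is in Python rather than Erlang, purely for illustration (the runtime does the buffering for you, of course):

```python
import struct

def frame(payload: bytes) -> bytes:
    # Prepend a 4-byte big-endian length, as {packet, 4} does on send.
    return struct.pack(">I", len(payload)) + payload

def deframe(buf: bytes):
    # Return (payload, rest) once a complete packet has accumulated,
    # or None if more bytes are still needed.
    if len(buf) < 4:
        return None
    (length,) = struct.unpack_from(">I", buf)
    if len(buf) < 4 + length:
        return None
    return buf[4:4 + length], buf[4 + length:]
```

The receiver never has to guess where a message ends: the prefix tells it exactly how many bytes to wait for.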
This kind of length prefixed packet framing is reasonably common, thankfully (endianness aside), but it’s not universal. Herein lies the rub.
Consider, for example, the Yamux packet format. It consists of 4 header fields followed by a length field. What’s wrong with this, you ask? Well, consider how you have to receive this protocol. First you’d read 12 bytes to get the header, then read an additional N bytes to receive the payload. This is fine, but it involves more tracking and buffering as compared to the packet,N approach, despite being essentially identical.
It gets even worse: consider the mplex muxer protocol. The protocol messages begin with 2 varints; one is the header flags and the second is the payload length. This is a real pain in the ass because now you can’t even do a fixed receive to read the packet length (I mean, technically you can, because the varints have a maximum length). Again, though, that’s a lot of extra work as compared to packet,N: you have to do a blocking recv of at least whatever the maximum varint size is multiplied by 2, or you can read it bytewise and accumulate until you have all of both varints.
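For reference, mplex’s varints are protobuf-style unsigned varints (7 bits per byte, least significant group first, high bit set on every byte except the last). The bytewise accumulate-and-decode approach might look like this Python sketch (illustrative only, not the actual mplex implementation):

```python
def read_varint(buf: bytes, pos: int = 0):
    # Decode one unsigned varint starting at pos.
    # Returns (value, next_pos), or None if the buffer ends mid-varint
    # and we need to accumulate more bytes.
    shift = 0
    value = 0
    while pos < len(buf):
        b = buf[pos]
        value |= (b & 0x7F) << shift
        pos += 1
        if not (b & 0x80):  # high bit clear: this was the final byte
            return value, pos
        shift += 7
    return None
```

Note how the decoder itself can’t tell you up front how many bytes to recv; you only know a varint is finished when you see a byte with the high bit clear.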
Another example is the UBX binary protocol (see section 33.2) used on u-blox GPS receivers. It has 2 bytes of sync word, 1 byte of message class, 1 byte of message ID and a 16-bit little-endian length field. It’s not a bad protocol; in fact this is a good structure, because it can be sent over transports where bytes can be dropped, so the sync word is very necessary. But it again can be clumsier to work with than desired.
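To show the extra bookkeeping involved, here is a hedged Python sketch of deframing a UBX-style stream. It uses the layout above plus UBX’s trailing 2-byte 8-bit Fletcher checksum (computed over class, ID, length and payload); error handling is simplified:

```python
import struct

SYNC = b"\xb5\x62"  # UBX sync word

def ubx_checksum(body: bytes):
    # 8-bit Fletcher checksum over class, ID, length and payload bytes.
    ck_a = ck_b = 0
    for byte in body:
        ck_a = (ck_a + byte) & 0xFF
        ck_b = (ck_b + ck_a) & 0xFF
    return ck_a, ck_b

def ubx_deframe(buf: bytes):
    # Scan for the sync word, then try to pull out one complete frame.
    # Returns ((msg_class, msg_id, payload), rest), or None if the frame
    # is incomplete or the checksum fails (a real parser would resync).
    start = buf.find(SYNC)
    if start < 0:
        return None
    buf = buf[start:]
    if len(buf) < 6:
        return None
    msg_class, msg_id, length = struct.unpack_from("<BBH", buf, 2)
    end = 6 + length + 2  # header + payload + 2 checksum bytes
    if len(buf) < end:
        return None
    ck_a, ck_b = ubx_checksum(buf[2:6 + length])
    if (buf[6 + length], buf[7 + length]) != (ck_a, ck_b):
        return None
    return (msg_class, msg_id, buf[6:6 + length]), buf[end:]
```

The sync-word scan is what makes the format robust over lossy links, but it also means the receiver is doing strictly more work than a simple length-prefixed read.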
What if there was a better way? How does Erlang do its magic with packet,N, and what other packet types are there? It turns out that it’s done with something called the packet parser, and it supports quite a few packet types:

- raw - No packet parsing
- 1, 2, 4 - The packet,N mode described above
- asn1 - ASN.1 BER
- sunrm - SUN RPC encoding, another classic
- cdr - CORBA, nuff said
- fcgi - Fast CGI
- tpkt - TPKT format from RFC1006
- line - Newline terminated
- http - HTTP 1.x response packet
- httph - HTTP 1.x headers (used by http as well)
This is actually a surprisingly rich selection of packet types (although with a distinctly 90s vibe). Each of these packet types has code that checks whether the packet is complete or more bytes are needed. The packet parser is actually used in 2 places: in the TCP receive path, and in erlang:decode_packet/3, which takes a packet type, some binary data, and some packet options. Thus you can decode from a TCP (or TLS) socket, or from a file, or from memory.
Now, as you’ll no doubt have noticed, this is a fairly arbitrary selection of protocols. For example, WebSocket (which has its own framing mechanism) is nowhere to be found, likely because it was invented long after 1995. Similarly, none of the protocols I mentioned above appear, which is not surprising.
Having hit the limits of Erlang’s packet parser in the past, I finally decided yesterday to try to support a new packet type. However, I didn’t want to add just any packet type, but rather a way to describe many common binary framing schemes so I could support yamux, mplex, UBX and anything else that was relatively simple (websocket framing is more complicated so it’s beyond what I’ve implemented below).
The result I came up with can be found here. It enables functionality like this:
4> erlang:decode_packet(match_spec, <<16#deadbeef:32/integer-unsigned-big, 2:16/integer-unsigned-little, "hithisisthenextpacket">>, [{match_spec, [u32, u16le]}]).
{ok,<<222,173,190,239,2,0,104,105>>,
<<"thisisthenextpacket">>}
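The semantics of that call are easy to state outside Erlang too. This Python sketch (illustrative only, and omitting the varint type) mirrors what the match_spec decoder is doing: walk the fixed-width fields, treat the last one as the payload length, and split off one complete packet:

```python
import struct

# (size in bytes, struct format) for each fixed-width field type.
FIELDS = {"u8": (1, "B"), "u16": (2, ">H"), "u16le": (2, "<H"),
          "u32": (4, ">I"), "u32le": (4, "<I")}

def decode_packet(spec, buf: bytes):
    # Walk the fields; whatever the last field decodes to is the payload
    # length. Returns (packet, rest) once a full packet is buffered,
    # or None if more bytes are needed.
    pos = 0
    length = 0
    for field in spec:
        size, fmt = FIELDS[field]
        if len(buf) < pos + size:
            return None
        (length,) = struct.unpack_from(fmt, buf, pos)
        pos += size
    end = pos + length
    if len(buf) < end:
        return None
    return buf[:end], buf[end:]
```

With the spec [u32, u16le], the u16le field decodes to 2, so the packet is the 6 header bytes plus 2 payload bytes, and everything after that is left for the next packet, matching the shell output above.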
And more broadly things like this:
test() ->
{ok, LSock} = gen_tcp:listen(5678, [binary, {packet, raw},
{active, false}, {reuseaddr, true}]),
spawn(fun() ->
{ok, SSock} = gen_tcp:accept(LSock),
gen_tcp:send(SSock, <<16#deadbeef:32/integer, 2:8/integer, "hi",
16#c0ffee:32/integer, 3:8/integer, "bye">>),
timer:sleep(infinity)
end),
{ok, S} = gen_tcp:connect("127.0.0.1", 5678, [binary, {active, true},
{packet, match_spec}, {match_spec, [u32, u8]}]),
io:format("connected~n"),
receive
{tcp, S, <<16#deadbeef:32/integer,Length:8/integer, Data:Length/binary>>} ->
io:format("Got data ~p~n", [Data]) %% Data is 'hi' here
end,
receive
{tcp, S, <<16#c0ffee:32/integer,Length2:8/integer, Data2:Length2/binary>>} ->
io:format("Got data ~p~n", [Data2]) %% Data2 is 'bye' here
end.
Essentially it allows you to define a list of fields (available types are u8, u16, u16le, u32, u32le and varint), the last of which is the payload length field. Thus the yamux spec would be [u8, u8, u16, u32, u32] and the mplex spec would be [varint, varint]. Annoyingly, the UBX protocol doesn’t work with this scheme because 2 checksum bytes appear after the payload but are not included in the length. I will try to think of a way to support this relatively common pattern as well. Perhaps something like [u8, u8, u8, u8, u16, '_', u16] and have the _ indicate the variable-length payload immediately following the length field (non-payload-adjacent length fields is probably pushing the limits of what this feature should do).
So, how the hell does all this work? Well, it’s remarkably complicated and has to touch some rather gritty corners of the BEAM. Essentially, as noted above, there are 2 ways to invoke the packet parser. erlang:decode_packet goes through erl_bif_port.c, which implements all the built-in functions (before NIFs there were BIFs, but only OTP was allowed to implement them) for dealing with ports. Like NIFs, BIFs get passed some C version of Erlang terms which they have to destructure and interpret to control the behaviour of the C code. Annoyingly, this is not the same enif API as NIFs use; it appears to be some distant ancestor of it. Anyway, once we’ve parsed the arguments to erlang:decode_packet and decoded the options, we call packet_get_length, which returns -1 on error, 0 on ‘not enough bytes’, or a positive integer (the length of the packet) when it has a complete packet for whatever the selected packet type is. This is the simpler path.
For sockets, we first have to traverse gen_tcp, which delegates the parsing of packet options to inet.erl, which quickly calls into prim_inet, which constructs the actual port commands to the inet_drv port. In Erlang, ports are essentially sub-programs that communicate with the host BEAM via (usually) stdin/stdout/stderr (or other file descriptors). Sometimes, as in the case of the ODBC port, the port opens a TCP connection back to the BEAM for performance. Ports are one of the oldest mechanisms the BEAM has for interoperating with the operating system or underlying hardware, and their process isolation means they remain the safest.
However, because data now has to cross a process boundary, we have to marshal/unmarshal it to get it across. Again, inet_drv probably predates erl_interface, which provides some nice support for this (including a way to un-marshal the Erlang binary term format), and it does all its communication with a fairly simple binary ‘protocol’. Essentially each ‘command’ is prefixed by some kind of INET_OPT shared constant followed by some optional data. For example, setting reuseaddr is done via the INET_OPT_REUSEADDR constant (defined as 0). prim_inet handles turning {reuseaddr, true} into something that looks like <<?INET_OPT_REUSEADDR:8, Value:32/integer>> and sending it down to inet_drv, where it is parsed in a giant switch statement and then somehow actually applied using setsockopt.
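The shape of such a command is trivial to picture; as a Python illustration (the INET_OPT_REUSEADDR value of 0 is taken from the text above, and the rest is just byte packing, not the real prim_inet code):

```python
import struct

INET_OPT_REUSEADDR = 0  # shared constant, per the text above

def encode_sockopt(opt: int, value: int) -> bytes:
    # One option byte followed by a 32-bit big-endian value, mirroring
    # <<?INET_OPT_REUSEADDR:8, Value:32/integer>>.
    return struct.pack(">BI", opt, value)
```

So {reuseaddr, true} ends up as five bytes on the wire to the driver, which is why adding a new option means touching both the Erlang-side encoder and the C-side switch statement.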
This is mostly fine, although the big snag is that the prim_inet module is special in that it’s preloaded. Preloaded modules are BEAM bytecode that is essentially compiled into the BEAM when the BEAM is built and cannot be reloaded or changed without rebuilding the BEAM. Even more interestingly, the preloaded modules are not normally compiled when you build OTP from source; the OTP distribution, and the git repo, contain the precompiled beams. If you wish to perform the dark art of recompiling a preloaded beam you must use make preloaded, which re-compiles any changed preloaded beams (but does not put them in the right place for the BEAM build process to pick them up). If the compilation looks like it worked, you can then use ./otp_build update_preloaded, which will recompile the preloaded beams and put them in the right place (note that this will recompile ALL the preloaded beams and also make a git commit on your behalf(???), so use with caution). You can also simply copy the beam file you’ve recompiled into the right place by hand.
Preloaded beams also have some restrictions. For example, you probably don’t want to call io:format() from inside them, because preloaded beams can run before the BEAM is fully booted and some things like the io service might not be available yet. Happily, debug macros are provided to ease the pain a bit.
So, to get my new packet type and options to work, I had to work my way down through the layers of parsing, serialization, deserialization and usage to actually get my new options to make it all the way to inet_drv’s use of the packet parser. This was not easy, and I might not have done it the right way, but I eventually did get it to work.
To summarize: in less than a day’s work and less than 200 lines of (only somewhat horrible) code, I was able to add what I think is a useful feature to Erlang, despite having touched hardly any of these parts of the Erlang system before. I hope to clean this up some more and submit it to the OTP team for inclusion. I will probably change the name from match_spec to packet_spec or something, and maybe try to support the UBX use-case better. I don’t know how much longer inet_drv will be around (the file driver was rewritten to be a NIF that uses dirty schedulers for OTP 21; maybe the inet driver is next?) but maybe we can think about keeping the idea of powerful packet parsing down in the VM and evaluate approaches like this to make it more flexible (and less 90s themed). Longer term, it might be nice to have something like BPF programs you pass down into the packet parser, but that would be a lot more work.
Finally, I’d like to thank Marc Nidjam for pitching in on the varint support and the tests (not all his code is in there yet). Any other suggestions or assistance is most welcome.
Of communities and bikesheds - 12 Feb 2018
So, this morning a new Erlang package building tool was announced. I happened to be reading the erlang-questions mailing list (a fairly rare occurrence, as we’ll get into) and I saw the announcement. As soon as I saw the name of the project, I decided to ignore the thread. However, that thread soon re-connected with me via 2 IRC channels, a Slack channel and Twitter. The project’s name? Coon.
Now, having grown up in Ireland, I was unfamiliar with the word, or the racist connotations. Only since moving back to the US have I been introduced to the surprisingly large lexicon of American racism that was not mentioned in ‘To Kill a Mockingbird’ or ‘Huckleberry Finn’. Thus, given that the author didn’t seem to be a native English speaker, and certainly not someone expected to be familiar with derogatory American slang, I expected someone to politely point this out and for the author to realize they’d made a terrible mistake and rename it.
Well, at least the first part happened.
About now is the time to mention why I don’t regularly follow the erlang-questions mailing list anymore. Many years ago, when I was new to Erlang, I was an avid reader of the mailing list. However, over time something changed. I’m not sure if I simply became proficient enough with the language or if the tone of the mailing list changed as the community grew, but I began to lose patience with the threads on naming and API design that would always grow out of all proportion to their importance while deep, technical discussions would often be overshadowed. For the most part this was just annoying, but harmless and I gradually drifted away from paying close attention to it.
Today however, things are a little different. There’s yet another naming discussion, and people are adding their opinions to a dog-pile of a thread faster than you can read the responses, but this time it’s about the accidental use of a racist slur as a project name.
Now, let’s remember, this is a programming language community. These communities are supposed to help practitioners of the language, advocate for its use and generally be a marketing and outreach platform to encourage people to use it. There are a lot of programming languages these days and developer mindshare is valuable, especially for an oddball language like Erlang. And while it is true that communities are not always (or maybe even often) inclusive or welcoming, surely programming communities should be.
Instead the thread (and I confess to having not read the bulk of it) devolved into arguments around intent vs effect, and appeals that other problematic project names had flown under the radar in the past. I’m sorry, but this is not how it works. When you create something and release it into the world, you lose control of the interpretation that thing takes on. I’ve seen cases of authors whose work is analyzed in a school curriculum vehemently disagreeing with the interpretation of their own creation. It’s easy to forget that building things, naming things, etc. are as much, if not more, about the effect produced in the consumer of that work as about the author’s intent. You don’t get to say “That’s not what I meant” when someone points out a problem with what you’ve done; you need to examine the effect and determine whether you feel you should correct it. This is your responsibility as a member of a community, and if you’re hurting inclusivity or diversity then you are not being a good member of that community.
When I visited ‘coonhub’, the associated website for the tool that lists available packages, I saw one of my own projects prominently featured. Given that I am not a member of a group to which the derogatory term applies, I didn’t expect to feel anything, but instead I felt ashamed that I, however indirectly and involuntarily, was lending support to this. I can’t imagine what it feels like for someone to whom the slur has been applied, but the faint echo I encountered was unpleasant enough to give me pause.
Long story short, I hope the Erlang community can pull its head out of its ass long enough to realize that bikeshedding about something like this is bordering on the obscene and should shut that shit down. The original author should recognize their mistake, sacrifice their beloved ‘coonfig.json’ pun, rename the project and everyone should move on. A 50 email thread on the matter is ridiculous and is not appropriate.