Monday, September 2, 2013

Next Step: Uno

I'm posting this here so that I have one more reason not to slack off: the next thing I'm going to try is to make a game of Uno based on Harmonic. Why another dummy instead of Magic? Because, while Uno is of course much simpler than Magic, the two share a few features that make it interesting as a prototype. And as Uno is much less complex, it's faster to spot where features are still missing.

So what are these similarities? I identified the following few:
  • it's a card game, duh
  • unlike TicTacToe, Uno can have more than two players
  • the turn order in Uno can change
  • there are multiple zones in which cards can reside; the deck is hidden, the discard pile is public, and the hands are private to each player
  • Uno has randomness for shuffling the deck
  • some cards require additional decisions by the player
I also found at least two things that are very different from Magic, at least with regard to Uno's applicability as a prototype: there is no such thing as tokens, i.e. no new objects are introduced into the game after it starts; and there is no significant strategic component to Uno, so testing the feasibility of an AI is hard. Still, Uno should be really easy to implement and still provides a lot of the things I was hoping to find, so that's where I'll go next.

Friday, August 9, 2013

Multiplayer-TicTacToe

If you read my previous post and/or looked at my GitHub account, you already know that I'm programming a small game of TicTacToe to demonstrate and debug my state synchronization library, Harmonic. At first, I had two engines, where one actually played the game, and the other simply listened. Both engines were located in the same JVM, so identifying them was easy.

At that time, I didn't even have states yet, so the communication exchanged the actual actions. This was a problem: although both engines executed the same actions, there was no way for the application to verify that they really were in sync. Only later, when the communication consisted of states, could this really be called synchronization.

The real fun began when I added a notion of branches. A branch points at a state, and can move to other states as well. An engine can publish changes to a branch, and the receiving engine will then ask for the missing states. Branches also make it possible to delve into alternate histories, for example for AI.

Although states and actions are defined as protobuf messages, nothing I have described so far contains any networking or IO code. The BranchManager class has three methods that do communication, but all they do is call a callback method and leave the actual networking to whatever the implementation of that callback sees fit. I added an implementation using JGroups, the same library I used in my last attempt.
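
To illustrate the pattern, here is a minimal sketch of what such a callback might look like; the interface and class names are mine, not the actual BranchManager API, and I'm assuming the plain JGroups 3.x channel API:

// Hypothetical callback interface: the synchronization code hands over a
// ready-made payload and leaves the transport to the application.
interface SyncCallback {
    void send(byte[] payload) throws Exception;
}

// One possible implementation, backed by a JGroups channel.
class JGroupsCallback implements SyncCallback {
    private final org.jgroups.JChannel channel;

    JGroupsCallback(org.jgroups.JChannel channel) {
        this.channel = channel;
    }

    @Override
    public void send(byte[] payload) throws Exception {
        // a null destination address means "send to all cluster members"
        channel.send(new org.jgroups.Message(null, payload));
    }
}

Swapping JGroups for plain sockets, or for no networking at all in a local test, then only means writing another small callback class.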

And now, I have a graphical multiplayer Tic Tac Toe. Granted, it does not enforce that every client plays only his own pieces, but that's not the point and would be easy to do; it just requires a more sophisticated new-game negotiation. What does work out of the box is spectating. A newly started client does not yet ask for the current state, but as soon as one of the playing clients performs an action, the spectator is also updated.

Right now, to try it, you have to build Polybuf, Harmonic and TicTacToe yourself, which is not optimal. TicTacToe is meant to help me experiment, find out what's good practice and what needs improvement. When it comes to distribution, applications have different requirements than libraries, and fulfilling these requirements is the next thing on my schedule.

Okay, that's fixed now! I have added a git repository that contains a Maven repository. That means you can reference my libraries (for example, in building TicTacToe), and the dependencies can be downloaded from there, without the need for you to build them yourself.

The repository also contains a TicTacToe-0.1.0-jar-with-dependencies.jar, which... well. It also has a Main-Class attribute, so it should be reasonably double-clickable. I have only tested it on one machine, and I don't know how the default JGroups stack behaves in LANs and WANs, but fixing issues arising from that is a matter of passing a configuration to a constructor...

Thursday, August 1, 2013

Harmonic: Game replication and git?

So I took some time off from trying to implement game replication. Instead, I did some other stuff (like university) and gathered some experience with git. With time, I realized that git, and distributed version control in general, provides exactly those concepts needed for the model of replication I wanted. Looking back at an earlier post on that topic, I can't really say how I expected it to work out. It turned out that transmitting "unfinished transactions" was a really bad idea; after all, the very essence of a transaction is that it executes either as a whole or not at all.

Well, anyway, now I had experience with git, and it contained exactly those concepts I needed for my replication as well:
  • a repository of files (objects) in which changes are tracked
  • a history of commits, where commits have a parent/child relation. The main attribute of that relation is what changes happened between those commits
  • branches which point at specific commits and provide a comfortable way of traversing and manipulating the history
  • definition of remote repositories for synchronization of branches
In contrast to the transaction approach I pursued earlier, where the parent/child relationship reflected application logic and did not really add to the synchronization capabilities of the framework, this version control approach is built around a tree of commits, where the parent/child relationship is a temporal one and essential to the framework. The framework is less cluttered with concepts that are not important to it.

Of course, a few things are different: a git repository manages files, essentially byte sequences, while a replication framework manages objects. These could be interpreted as byte sequences, but that doesn't really add to usability, because objects are usually manipulated on the level of fields and methods. Furthermore, files are organized in a file system, with a human-chosen name for every single file. Objects simply reside at some location in memory that not even the program can control; some other sort of identification is needed. And finally, for file versioning, a human user can try to resolve conflicts. For replication, this is not possible: conflicts can't be resolved automatically, and as different objects store different data, even manual merging could result in inconsistent state.

Luckily, all but the last of these problems can be solved. And the last one usually shouldn't be a problem in the first place, at least when talking about Magic. Magic is not a game of reaction time where all players try to do something at the same time and whoever was earliest wins. Magic's system of priority defines very clearly which player may take an action at what point, so the game logic can ensure that conflicts do not arise, simply by obeying the rules.

You can take a look at Harmonic if you want. I'm not happy with the name, but I couldn't come up with anything better. I wanted it to sound like multiple independent systems are in sync, like vibrating at the same frequency, or looking the same; I don't think I succeeded, but that's okay as long as it works.

An Entity compares to a file. Entities are objects managed by Harmonic and are identified by a numeric ID. IDs are created by simply incrementing a counter; since all engines execute the same actions, different instances of Harmonic consistently assign the same IDs to the same entities.

An Engine compares to a repository; it is a store of entities. An Engine also manages the different states (analogous to commits) and its current state: in a repository, the files always reflect one specific commit; in the engine, the entities always reflect one state, and the engine knows how to move between states. Engines are identified by random IDs: engines are created on different machines, so there is no counter-based way to ensure that their IDs are different, and random IDs are the next best thing.

States, as mentioned above, compare to commits. Git uses random commit IDs - at least they look random. In Harmonic, the engine ID is part of the state ID. Assuming the engine IDs are different, it's enough to simply increase the ID for each subsequently created state.
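
To make the ID scheme concrete, here is a tiny sketch; the names are illustrative, not Harmonic's actual code:

class Engine {
    // random: engines are created on different machines, so no counter
    // could guarantee that two engines get different IDs
    final int engineId = new java.util.Random().nextInt();

    // deterministic: every engine executes the same actions in the same
    // order, so all engines assign the same IDs to the same entities
    private int nextEntityId = 0;

    // per-engine counter; unique in combination with the engine ID
    private int nextStateIndex = 0;

    int newEntityId() {
        return nextEntityId++;
    }

    long newStateId() {
        // a state ID embeds the creating engine's ID plus a local counter
        return ((long) engineId << 32) | (nextStateIndex++ & 0xFFFFFFFFL);
    }
}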

Last but not least, Actions and Modifications correspond to the changes that make up a commit. A modification is a single change that is easily reverted, like setting a variable (or creating a new entity ID). An action is a sequence of modifications, so the action can be reverted by reverting the modifications making it up.

In this model, states, containing actions, are transmitted between engines, but modifications are not. Actions are defined by the application, so they may be as small as needed and as big as possible. The modifications making up the action are only created during execution, so they don't need to be transmitted, making large actions more efficient. For example, instead of transmitting that a card is tapped and mana is added to a mana pool, it's now enough to transmit that a mana ability of a land was played.
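
A sketch of how that split might look in code; again, the names and signatures are mine, not necessarily Harmonic's:

import java.util.ArrayDeque;
import java.util.Deque;

// A modification is a single, easily reverted change.
interface Modification {
    void apply();
    void revert();
}

// An action is defined by the application; its modifications are only
// recorded while it executes, so they never have to be transmitted.
abstract class Action {
    private final Deque<Modification> mods = new ArrayDeque<>();

    // subclasses call this for every change they make
    protected void execute(Modification m) {
        m.apply();
        mods.push(m); // most recent change on top
    }

    // application-defined behavior, e.g. "play a mana ability of a land"
    public abstract void perform();

    // reverting an action means reverting its modifications in reverse order
    public void revert() {
        while (!mods.isEmpty()) {
            mods.pop().revert();
        }
    }
}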

I like the direction this is going. I think distributed source control is an elegant concept, and I don't think I did a bad job of capturing its essence, so this should turn out to be pretty usable. In fact, while developing Harmonic, I also wrote a game of Tic Tac Toe that uses it, so I have a proof of concept that Harmonic really works as intended in a turn-based game.

Wednesday, July 31, 2013

git-flow

Check it out. Do. The "A successful Git branching model" post has, as far as I can tell, gone viral. I stumbled upon it some years ago, when git was almost new to me. I felt that it was a good idea to have some structure in your branching. Intriguing was the idea that, whenever you check out a repository, the head of the master branch is something stable, something you could immediately build and use. You don't have to search the history for an earlier commit where the code did work; if it's on the master branch, it works.

And I wasn't the only one intrigued by the branching model. Someone wrote a set of additional git commands that lets you use the high-level "git-flow" concepts such as "start feature" and "finish release" instead of the individual branch and merge commands.
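
For example, the whole lifecycle of a feature branch boils down to a handful of commands (the feature name is just an example):

git flow init                  # one-time setup: choose your branch names
git flow feature start uno     # branches off develop
# ...hack, commit, hack...
git flow feature finish uno    # merges back into develop, deletes the branch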

Despite the benefits, I wasn't really comfortable with git on the command line. I'm still not. I feel that for efficient code versioning, a concise representation of changes is essential. I might be hacking along, not sure whether the code is worth committing, before finally saying, "yes". Then, I start up my git tool of choice, SmartGit, and review what changes I actually made, figure out what that means, stage individual changes that are logically related, and commit them with relevant commit messages. My personal feeling is that I'm more productive with a graphical tool.

But recently, SmartGit started to support git-flow out of the box, and I'm happy to be able to finally use git-flow comfortably. Really, go ahead and read the original post and try git-flow or SmartGit. You'll like it, and I'll like it if I check out your repository and have runnable code right in front of me.

I'm back(?) and polybuf

I hope I am!

I was just thinking that I was in the mood to write something for my blog. And I have things to talk about. I have some new code online at GitHub; Polybuf and Harmonic might be interesting to you.

Mind you that I don't have any Laterna Magica code online there. I'm not sure about LM's future, besides that it will definitely have one! Looking at the history, it seems it was over two years ago when I added the last bit of real functionality. Somewhere in that time frame, I realized that the whole architecture of LM wasn't able to address all the stuff I wanted to get out of it - and that stifled the fun I had with the project.

But LM was always in the back of my head. I tried to figure out what I wanted, what I needed to do so, and what was fun for me to make at that moment. And here I am, writing a new blog entry, because it's fun to me. I hope it will stay fun to me, so I hope I'm back!


Now to make this post not entirely meta, let's talk about protobuf and polybuf.

Protobuf is a protocol library which is apparently widely used inside Google. You specify a protocol using .proto files, and protobuf creates code to read and write that protocol. It's basically a replacement for serialization, and provides code generators for different target languages, so that programs written in different languages can exchange data.

The protobuf code generator supports options that let you control how the code is generated; without losing protocol compatibility, you can get code that is more or less optimized for memory footprint, code size, or parsing time. Protobuf is only one serialization replacement library of many, and depending on how you run your benchmarks, you will probably get different results as to which is the fastest. Anyway, protobuf definitely has the features I need, whether or not it's the "best", and it's supposed to have good documentation. I can't complain so far.
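
For example, a single line in the .proto file controls that trade-off; these option values are part of standard protobuf:

option optimize_for = CODE_SIZE;  // alternatives: SPEED (the default) and LITE_RUNTIME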


One thing I didn't like about serialization from the beginning is that you throw the Serializable interface right into your code. Exporting and importing state is a very specialized aspect of a piece of software, and it's likely not directly related to the core classes of your software. The thought behind serialization is that adding the implements Serializable clause immediately enables you to serialize your objects - but that's only true if your objects really correspond to whatever you want to transmit over the network or store on disk. Any discrepancies mean that you have to work around this central concept of serialization, often making the code even less readable, and further clogging your classes with logic that does not correspond to their original concerns.

And after you're finished with all this, you have a class that contains logic for one specific serialization mechanism. If you want to swap that mechanism out, you have to dig directly into your application logic classes instead of replacing some auxiliary import/export classes.

And then there's the fact that serialization semantics are... funny to begin with. Deserialization does not invoke a constructor of the class being deserialized. It creates an empty object without using constructors and then populates the fields manually from the stream. Deserialization can even modify final fields. But after that, you yourself can't - not even using the mechanisms that deserialization provides.


Protobuf (and its likes) kind of works around these problems. You're not serializing your logic objects, but dedicated messages that have well-defined serialization properties. On the receiving end, you can restore your objects using whatever constructors and methods you see fit. That seems like more work, but only in the short run; your benefits are familiar semantics (constructors and methods), separation of concerns, easier reaction to changes (through that separation, and the fact that protobuf was created with the explicit goal of compatibility between protocol versions), and of course improved performance through dedicated code instead of reflection.
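
As a small illustration of that separation, here is a sketch with a hypothetical domain class Card and a generated message CardMsg, assumed to have an optional string name = 1 field:

class Card {
    private final String name;

    Card(String name) {
        this.name = name;
    }

    // export: build a dedicated message instead of serializing Card itself
    CardMsg toMessage() {
        return CardMsg.newBuilder().setName(name).build();
    }

    // import: a plain constructor call, no reflection magic involved
    static Card fromMessage(CardMsg msg) {
        return new Card(msg.getName());
    }
}

The serialization knowledge lives in two small methods (or, even better, in a separate class), and replacing the mechanism later means touching only those.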

One thing, by the way, that protobuf does not support is polymorphism. The reason is simple: there are other patterns in protobuf that enable the same functionality, and polymorphism is a feature handled differently in different languages (Java has no multiple inheritance, and C obviously has no inheritance at all). Making polymorphism a feature on that level would limit the interoperability of protocols.


And that's where polybuf comes into play, and where I'll finally show some example code. Polybuf is essentially a convention for how to format messages that contain polymorphic content, plus a collection of classes that facilitate deserializing these messages. At its core is the polybuf.proto file; here are the relevant parts:

message Obj {
  optional uint32 type_id = 1 [default = 0];
  optional uint32 id = 2 [default = 0];
  
  extensions 100 to max;
}

Proto files contain message definitions. The messages declare fields that may be required, optional or repeated. required is actually not recommended, because it makes changing the message later slightly harder. This message features only one data type: uint32. Protobuf encodes these values as compactly as possible: small numbers take less than four bytes, while large ones may take up to five. For fields where large values are expected, there are fixed-length encoding types instead. Every field has a numeric tag that is encoded instead of the field name. Fields can be removed and added without breaking compatibility, as long as these tags are not reused - at least on the protobuf level. Of course, the application must be able to process messages where some fields are missing; that's what the default values are for. They are not transmitted; the receiver simply fills in the values if the field was not present (very possible for optional fields).

Below these two fields, there is an extension declaration, and now it becomes interesting: 100 to max specifies that all field tags greater than 100 are free to use for extensions. There is an example directly in the file:

//  message Example {
//    extend Obj {
//      optional Example example = 100;
//    }
//    
//    optional string value = 1;
//  }

The Example message declares an extension to Obj, which is simply a set of additional fields labeled with tags from the extension range. It has nothing to do with an extends clause and is pretty independent of the Example message; it just happens to be placed there for logical grouping. The only consequence of the placement is the namespace used by the extension.

The Example message itself declares a field value, plain and simple. Now, where's the benefit? Imagine there's a class Example in your program and you expect it to be subclassed, but you still want to support protobuf serialization. This structure allows you to
  • reference an Example by declaring the field as Obj,
  • create an Example by setting type_id to 100 (the extension tag used by Example) and filling the example extension field with an Example message (there's a code sketch after this list),
  • create an ExampleSub message that uses a different extension tag, and simply fill both the Example and ExampleSub extension fields. From the type_id field, you can see that it's an ExampleSub object, and since you know the ExampleSub class structure, you know that there will be an Example message containing the superclass fields.
  • The same goes for multiple inheritance: If you know that the type_id-referenced class has two superclasses, simply read both superclass extension messages from the Obj.
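Here's a sketch of how that looks from the Java side, assuming the commented-out Example message from the file were actually compiled and the generated classes for Obj and Example are available (the exact class nesting depends on the .proto file's java options):

import com.google.protobuf.ExtensionRegistry;

class PolymorphismDemo {
    static byte[] encode(String value) {
        Example payload = Example.newBuilder().setValue(value).build();
        Obj obj = Obj.newBuilder()
                .setTypeId(100) // the extension tag identifies the concrete type
                .setExtension(Example.example, payload)
                .build();
        return obj.toByteArray();
    }

    static Example decode(byte[] bytes) throws Exception {
        // extensions must be registered, otherwise the parser skips them
        ExtensionRegistry registry = ExtensionRegistry.newInstance();
        registry.add(Example.example);
        Obj obj = Obj.parseFrom(bytes, registry);
        if (obj.getTypeId() != 100) {
            throw new IllegalArgumentException("unknown type_id: " + obj.getTypeId());
        }
        return obj.getExtension(Example.example);
    }
}
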
Of course, there's some Java code supporting this message structure to give a more serialization-like experience. Look at it. Input and Output are similar to the streams you're used to; however, they don't convert objects into bytes and vice versa, they convert between objects and Objs. From there, it's only a small step to networking and persistence. Config stores the IO objects that do the actual translation between objects and the corresponding extension messages. A Config is reused across subsequent Inputs and Outputs.

The IO interface is the most interesting one. Its three methods clearly specify the semantics that polybuf expects for (de)serialization, making it easy to understand what's going on. It extracts the logic from private methods in the class itself into a place that is actually meant to handle that concern, and it gives the programmer full control over what data is (de)serialized and which constructors and methods are called. And last but not least, these IO classes support inheritance! Deserializing a subclass can be as simple as subclassing the superclass's IO, adding your own subclass initialization and persistence, and delegating to the superclass IO for the fields that belong to it.
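
To give an idea of the shape of such an interface, here is a purely speculative sketch; polybuf's real IO has three methods, but these names and signatures are my paraphrase of the semantics described above, not the actual API:

// Hypothetical three-method (de)serialization contract.
interface IO<T> {
    // object -> extension message, attached to the given Obj builder
    void serialize(T object, Obj.Builder target);

    // first deserialization step: instantiate through a real constructor
    T create(Obj message);

    // second step: populate fields, now that referenced objects can exist
    void initialize(T object, Obj message);
}

A two-step deserialization like this would also explain how object graphs with circular references could be restored: first all objects are created, then all fields are filled in.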