Laterna Magica: MtG programming: Replication

Showing posts with label Replication. Show all posts

Friday, August 9, 2013

Multiplayer-TicTacToe

If you read my previous post and/or looked at my GitHub account, you already know that I'm programming a small game of TicTacToe to demonstrate and debug my state synchronization library, Harmonic. At first, I had two engines, where one actually played the game, and the other simply listened. Both engines were located in the same JVM, so identifying them was easy.

At that time, I didn't even have states yet, so the communication exchanged the actual action. This was a problem, because although both engines executed the same actions, there was no way that the application could really know. Only later, when the communication consisted of states, this could really be called synchronization.

The real fun began when I added a notion of branches. A branch points at a state, and can move to other states as well. An engine can publish changes to a branch, and the receiving engine will then ask for the missing states. Branches also make it possible to delve into alternate histories, for example for AI.

Although states and actions are defined as protobuf messages, nothing I described so far contained any networking or IO code. The BranchManager class contains three methods that do communication, but all they do is call a callback method and leave the actual networking to whatever the implementation of that callback sees fit. I added an implementation using JGroups; the same library I used in my last attempt.

And now, I have a graphical multiplayer Tic Tac Toe. Granted, it does not enforce that every client plays only his own pieces, but that's not the point and would be easy to do; it just requires a more sophisticated new-game negotiation. What does work out of the box is spectating. A newly started client does not yet ask for the current state, but as soon as one of the playing clients performs an action, the spectator is also updated.

Right now, to try it, you have to build Polybuf, Harmonic and TicTacToe yourself, which is not optimal. TicTacToe is meant to help me experiment, find out what's good practice and what needs improvement. When it comes to distribution, applications have different requirements than libraries, and fulfilling these requirements is the next thing on my schedule.

Okay, that's fixed now! I have added a git repository that contains a Maven repository. That means you can reference my libraries (for example, in building TicTacToe), and the dependencies can be downloaded from there, without the need for you to build them yourself.

The repository also contains a TicTacToe-0.1.0-jar-with-dependencies.jar, which... well. It also has a MainClass attribute, so it should be reasonably double-clickable. I have only tested it on one machine, and I don't know how the default JGroups-Stack behaves in LANs and WANs, but fixing issues arising from that is a matter of passing a configuration to a constructor...

Thursday, August 1, 2013

Harmonic: Game replication and git?

So I took some time off of trying to implement game replication. Instead, I did some other stuff (like university), and gathered some experience with git. With time, I realized that git, and distributed version control in general, provides exactly those concepts needed for the model of replication I wanted. Looking back at an earlier post on that topic, I can't really say how I expected it to work out. In the end, it turned out that transmitting "unfinished transactions" was a really bad idea; in the end, the very essence of a transaction is that it executes either as a whole or not at all.

Well, anyway, now I had experience with git, and it contained exactly those concepts I needed for my replication as well:

a repository of files (objects) in which changes are tracked
a history of commits, where commits have a parent/child relation. The main attribute of that relation is what changes happened between those commits
branches which point at specific commits and provide a comfortable way of traversing and manipulating the history
definition of remote repositories for synchronization of branches

In contrast to the transaction approach I pursued earlier, where the parent/child relationship is in regard to application logic and does not really add to the synchronization capabilities of the framework, this version control approach is based around a tree of commits, where the parent/child relationship is a temporal one, and essential to the framework. The framework is less cluttered with concepts that are not important to it.

Of course, a few things are different: a git repository manages files, essentially byte sequences, while a replication framework manages objects. While these could be interpreted as byte sequences, it doesn't really add to the usability, because objects are usually manipulated on the level of fields and methods. Furthermore, files are organized in a file system with human-chosen names for every single file. Objects simply reside at some location in memory not even the program can control; some other sort of identification is needed. And finally, for file versioning, a human user can try to resolve conflicts. For replication, this is not possible. Conflicts can't be resolved automatically, and as different object store different data, even manual merging could result in inconsistent state.

Luckily, besides the last problem, these can be solved. And for the last problem, usually it shouldn't be a problem in the first place, at least when talking about Magic. Magic is not a game of reaction time, where all players try to do something at the same time, and in the end someone was the earliest. Magic's system of priority defines very clearly which player may take an action at what point, so the game logic can ensure that conflicts do not arise, simply by obeying the rules.

You can take a look at Harmonic if you want. I'm not happy with the name, but I couldn't come up with anything better. I wanted it to sound like multiple independent systems are in sync, like vibrating at the same frequency, or looking the same; I don't think I succeeded, but that's okay as long as it works.

An Entity compares to a file. Entities are objects managed by Harmonic and are identified by a numeric ID. IDs are created by simply incrementing a counter, so creating IDs in different instances of Harmonic is consistent.

An Engine compares to a repository; it is a store of entities. An Engine also manages the different states (analogous to commits), and its current state: in a repository, the files always reflect one specific commit. In the engine, the entities always reflect one state, and the engine knows how to move between the states. Engines are identified with random IDs; as engines are generated on different machines, there is no way to ensure that engine IDs are always different. So, random IDs are used.

States, as mentioned above, compare to commits. Git uses random commit IDs - at least they look random. In Harmonic, the engine ID is part of the state ID. Assuming the engine IDs are different, it's enough to simply increase the ID for each subsequently created state.

Last but not least, Actions and Modifications correspond to the changes that make up a commit. A modification is a single change that is easily reverted, like setting a variable (or creating a new entity ID). An action is a sequence of modifications, so the action can be reverted by reverting the modifications making it up.

In this model, states, containing actions, are transmitted between engines, but modifications are not. Actions are defined by the application, so they may be as small as needed and as big as possible. The modifications making up the action are only created during execution, so they don't need to be transmitted, making large actions more efficient. For example, instead of transmitting that a card is tapped and mana is added to a mana pool, it's now enough to transmit that a mana ability of a land was played.

I like the direction where this is going. I think distributed source control is an elegant concept, and I don't think I did a bad job at catching its essence, so this should turn out to be pretty usable. In fact, while developing Harmonic, I also wrote a game of Tic Tac Toe that uses it, so I have a proof of concept that Harmonic really works as intended in a turn based game.

Tuesday, January 10, 2012

Entities and values

As so often in programming, one big question arose when I worked on my Undo/Replication framework: What is an entity, and what is a value?

I use the word entity to make clear that I mean something different from objects. An object is something identified by an address pointer, whereas for a primitive value, its bytes in memory directly represent the value instead of an address. What I mean with entity/value is best explained by an example from databases:

Suppose you have a database table storing the data of a company's employees. There is a surname, a prename, a birth date, and so on. Besides others, there's an address. The simplest approach is to handle each of these items as a table column: Surname and prename are strings, the birthdate uses a special date column type. The address can also be simply taken as a string, but there are other ways:

An address is composed of a street name, a number, probably also an appartment number, country, state etc. It could be beneficial to store all these data in a table on its own, give it an ID (the database way of saying address), and only reference this ID in the employee table.

Don't forget the third way! You could use all those columns, but put them directly into the employee table, so there would be prename, surname, birth date, street, stree no., appartment no., etc.

What's the best? Of course, it depends! Variant one is easy to use, but lacks intrinsic validity checks: When the database doesn't know that something should be an (appartment) number, how should it check it's even a number? Variant two can handle a very specific case "better": two workers have the same address. In this case, the same ID can be referenced multiple times, so that both employees not only have equal addresses but also have the same one. In this particular case, the benefit is not too big, but there are use cases where it's important that something can be referenced. About the third, it really combines the downsides of the both above... I personally see no benefit in using it. It there are, I'm sorry I didn't see them.

Now, what should have triggered your attention was the word "ID". As soon as something has an identity, it's an entity. Note how the prename is a String, which is an Object type in Java, but only a value. On the other hand, the address in case 2 has an identifier and is an entity.

And now we're back: What I do is to assign IDs to the objects I want to replicate. The objects have values which can be changed, and I transmit changes like "property XY of object 01 changed from 'ab' to 'cd'."

And now the big question: what about collections? Should it be "element 'ab' was added to list 01" (i.e. they are entities) or "element 'ab' was added to list XY of object 01?"

Saturday, January 7, 2012

Replicating Game States - JGroups

So now I've got it. My goals in implementing Replication was to keep it out of the core functionality while still making it easy to use without worrying about replication yourself. How do you do that? By providing a general interface to other software (in my case, that's a listener) and implementing a specific one for the task at hand.

So my History class, which stores all changes to the managed objects, has a listener for two possible events: A new modification is executed; and the "current" state is moved somewhere else (e.g. undoing something.) My replication listener does not support undo right now, because I haven't figured out how to best identify a state (which is where the program is going) yet, but that should be easy. The other thing, executing modifications, does work. That means, whenever a modification is executed locally, it is sent over the network so that the partner(s) can execute it too.

The only downside is that the partners also "execute it locally", so I had to use a trick to suppress resending these received modifications. And it works pretty well:

//initialization code
final History h = createHistory(createKey("test-history"));
final JChannel channel = new JGroupsReplicationListener(h) {
    @Override
    public void receive(Message msg) {
        Modification m = (Modification) deserialize(msg.getBuffer());
        if(m instanceof Creation) id = ((Creation) m).getId();
        super.receive(msg);
    }
}.getChannel();

//executing the actions (creating an object)
h.pushHistoryForThread();
try {
    TestBean t = (TestBean) h.getObjectStore().get(Creation.create(new TestBean()));
    t.setA(1);
    t.setB(1);
} finally {
    h.popHistoryForThread();
}

//printing the results
System.out.printf("current state: %s%n", h.getObjectStore().get(id));

I made this test a GUI application so that I can control the timing on multiple virtual machines. The public void receive(Message msg)seen above is only needed for this Test, again. It's not necessary to do something like that to get the replication. The code in the middle looks exactly the same as the example last time, except I haven't hidden some of the details in a setTest() method.

Now what's probably the most interesting is how the networking works; let me tell you, I didn't write any! As hinted by me before, and in the title of course, is that I have tried JGroups to do this. In JGroups, you create Channels and let them join into a cluster; all channels in the cluster can receive and send messages, either to only only one recipient, or to all at the same time. JGroups provides a configurable protocol stack that takes care of reliability, synchronity and such things. For my tests, I have used a UDP-based protocol stack. For wide-area connections, a TCP-based approach is probably easier.

Thursday, January 5, 2012

Replicating Game States - Working code!

Before you get too excited - no, there's no multiplayer yet (well, who would have guessed that...). There is, however, working code for undoing and redoing actions (even with multiple, branching histories), and even for replicating the state on multiple virtual machines.

The key to make such a library is that it looks like ordinary Java code from the outside, so that others can use your code without wondering why it looks so weird. Let's take a look:

//Test t = new Test(); is an attribute

//replaced angle with square brackets because of HTML markup
List[StateStamp] states = new ArrayList[StateStamp]();
History h = createHistory(createKey("test"));
h.pushHistoryForThread();
try {
    states.add(h.getCurrentState()); // 0
    print();

    setTest(t);
    states.add(h.getCurrentState()); // 1
    print();

    getTest().setA(1);
    states.add(h.getCurrentState()); // 2
    print();

    getTest().setB(1);
    states.add(h.getCurrentState()); // 3
    print();

    Deletion.delete(getTest());
    states.add(h.getCurrentState()); // 4
    print();

    h.goToState(states.get(3));
    print();
    h.goToState(states.get(2));
    print();
    h.goToState(states.get(1));
    print();
    h.goToState(states.get(0));
    print();
} finally {
    h.popHistoryForThread();
}

It's probably easy to to guess that print(); prints the current state of the test object. Besides that, the only code that looks different from normal operations is marked in bold. Most of it is only necessary for this test case: storing the previous states so that you can go back. Really mandatory is only the colored code (and the actual functional code of course, but that doesn't count.)

Of course, a history for the modifications must be created. Then, the history is set for the thread: Code running in the thread (that is, everything in this method, and nothing more) can access the history without passing it as a parameter (which would make using the library awkward) or declaring it as a static variable somewhere (which is ugly from an architecture standpoint.) Everything after this is wrapped in try/finally, so that the last method is not skipped in case of an exception: this removes the association of the history with the current Thread.

Of course, implementing setTest(), setA() etc. is different from a usual simple setter. But the point here is that using the code is as simple as shown here. The result is this:

Test@732A54F9[a=0, b=0] not in store
Test@7A6D084B[a=0, b=0] in store
Test@7A6D084B[a=1, b=0] in store
Test@7A6D084B[a=1, b=1] in store
Test@7A6D084B[a=1, b=1] not in store
Test@15301ED8[a=1, b=1] in store
Test@15301ED8[a=1, b=0] in store
Test@15301ED8[a=0, b=0] in store
Test@15301ED8[a=0, b=0] not in store

You see here how the test object runs through different states up to its deletion, then goes back to its initial state. What's not shown here, but is working, is adding another branch to this history. Of course, the actual objects have only one state at a time, but the history can be switched arbitrarily between multiple histories:

This is the actual test case I ran: I went back to the state after creating the test object, changed its a value, and then switched over to the state before, and then after, deleting the test again.

I have code for replication of states going, but it's not yet as clean as for undo: a lot of handling is necessary for the network, which should be transparent to the user. I'll show it once I'm finished, but in the meantime you can look here.

Saturday, December 31, 2011

Replicating Game States - revised

The last time I talked about my Undo system, which should also allow multiplayer through its structure of storing actions that can be transmitted over the network, was a long time ago. I don't want to directly talk about this today; you'll see.

Especially one thought has gone through my mind in the last weeks: Replicating a state across the network properly is nothing specific to Magic - so why do I implement it in the Core of Laterna Magica? I thought about it, and there doesn't seem to be a specific reason for doing so. Another result of that thought was that there are probably complete libraries out there which do exactly what I want. Well, that's unfortunately not the case, so I'm stuck/blessed with implementing it myself.

I haven't really gotten into development yet. I have a rough skeleton, but no really fancy functionality yet. But I'm having my thoughts. I think stepping away from the application's needs and looking at the general requirements of such a library help in developing a better piece of software. I haven't done this for my current undo system, which I find kind of disappointing. Normally, I don't have problems with spotting such generalizations.

Okay, back to topic: what does a library for state replication need?

There needs to a notion of what has happened on a higher, application level: Transactions
Transactions consist of atomic modifications, which are the core of the state

Optionally, they may also consist of other sub-transactions

Transactions and modifications must be transmittable over the network. If the whole state is represented as a tree of subtransactions under a root-transaction, then there's of course the need to transmit unfinished transactions. Otherwise, it might be enough to transmit finished transactions. I think that the first is generally more flexible. It might be a good idea to give the user a chance to mark transactions as atomic so that it's only transmitted when it is finished.
Probably the most interesting requirement is that, of course, after transmitting the state, both/all states have to be the same!
The system should be able to detect that there are state conflicts and that a transmitted transaction can't be carried out locally.

When we're already speaking of it, it's probably nice (but also probably impossible, so see that as an utopic remark) to resolve/merge such conflicts.

Now guess what I have an example for! Of course for the "most interesting" one in my list! It might seem at first that by defining modification, and making them granular enough to catch all actions in a system, there is no way that states may get out of sync. That's unfortunately not true. Let's take a simple case. This case actually only has a theoretical problem, but it's easier to illustrate and serves the purpose. Our state consists of two variables, foo and bar. The user may only set foo, and when he does so, the program automatically changes bar. Now consider this:

When the state replication system transmits these two modifications, the receiver does its own set operation on bar, because foo has changed. Then, the original set operation is transmitted, setting bar another time (fortunately to the same value).

Until now, I thought that this particular case is just me wanting to be clean. Now that I wrote this, I even see more problems, real ones: If we're able to detect conflicts, then this shouldn't even run through, because the last modification should be the second but really is the third. And even if we don't: this second modification in State 2 needs to be transmitted to State 1. Even if that doesn't result in errors, which is likely, it will waste network bandwidth at least.

Now for the real problem I spotted here before: Let's replace "bar = 1" with "put a token onto the battlefield". Setting a number multiple times is redundant, but creating an Object is not. There is a very definite difference between one and two tokens, you know!

Now, let's solve these problems. There are basically two ways that can lead to this problem: a complex setter, and events.

In the first case, the setter for foo includes code that calls the setter of bar. The problem here is that our replication library can't know that setting foo doesn't really do this: it actually sets foo and bar. So, refactor foo's setter so that it does only what it says, and not more. Hide this from the user by adding another method doSomething() that does what he previously did by setting foo.

Second is events. In this case, the change to foo fires an event that leads to bar being changed as well. In this case, the simple answer that is not so simple to implement is: don't fire events while replicating the state. The more complex answer that is simpler to implement is that this is similar to the first case: Don't write events that fire on state changes and perform state changes. Instead, fire a new Event in doSomething() that triggers the change. As the state replication won't call doSomething(), everything is fine.

Both solutions shown here require changes in the application, not in the library. This might seem bad, but it's actually logical: The library may sanely assume that variables change and objects are created, and that it may do that by itself as well. If the state does not fulfill the contract that it just contains values, but instead employs complex logic that the library can't predict, the task is impossible. The library needs some way to change values without triggering application logic.

Thanks for staying wiht me, and a happy new year to all of you!

Laterna Magica: MtG programming