[Twisted-Python] Operating on metainformation in a distributed system

Werner Thie werner at thieprojects.ch
Thu Aug 8 19:38:02 MDT 2013


On 8/7/13 2:08 PM, Krysk wrote:
> Hello. I tried to ask this on StackOverflow, but Glyph advised me this
> would probably be better. I've had trouble phrasing these questions
> well, so please ask me if you need more information.
>
> I'm working on a game server that needs to support 18,000+ players.
> The game requires fairly intense resource use, and I'd like the upper
> limit on the player count to be much higher than I need. The solution
> seems obvious: design a server that can scale out and up as necessary.
>
> Because distributed systems are hard, I tried to simplify the design
> so that distribution is as small a concern as possible. To that end,
> my architecture is pretty simple. A player is always assigned to a
> GameHandler instance. By default, a player is assigned to a
> Lobby(GameHandler) instance. They can then queue for a match, and when
> an appropriate match is found the server with the least load creates a
> new handler, say, CaptureTheFlag(GameHandler). Then, the servers those
> players connected to serve as reverse proxies, forwarding all data to
> the CaptureTheFlag handler. When that game ends, those players are all
> returned to their Lobby(GameHandler) instances. Reverse proxies are
> necessary because I didn't write the game client and modifying it is
> not an option. Connections cannot be renegotiated. I can place all the
> servers on the same LAN, which should prevent any major latency issues
> and make bandwidth a non-issue.
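>
> A minimal sketch of that per-player forwarding, using Twisted's stock
> portforward helpers (the host/port values here are placeholders, and a
> real server would pick the upstream per player rather than per port):
>
>     from twisted.internet import reactor
>     from twisted.protocols.portforward import ProxyFactory
>
>     # Forward every byte from locally connected players to the node
>     # actually running the CaptureTheFlag handler. ProxyFactory pipes
>     # both directions, so the game client never has to reconnect.
>     CTF_HOST, CTF_PORT = "10.0.0.2", 4000  # hypothetical LAN address
>
>     reactor.listenTCP(4000, ProxyFactory(CTF_HOST, CTF_PORT))
>     reactor.run()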
>
> So far, all is good, I think this design will work well and be very
> simple to work on. However, it raises the big, ugly question: How do I
> share metadata across the distributed nodes?
>
> That's necessary for the matchmaking itself. We might have 400 players
> connected across 10 servers, and we want to make a match where there
> are eight players on one server, four on another, and four on a third.
> I also need to be able to figure out how many players are on the
> entire network, synchronize bans and configuration data, etc.
>
> I was thinking I could use MySQL to store the configuration data, and
> use Redis for the transient data like who's online, who's in queue,
> etc. Then I could have one server dedicated to operating on all that
> data (such as arranging fair matches). I could use some kind of push
> notification to let servers know when a match has started or ended, or
> just have them query Redis periodically.
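>
> As a rough sketch of the Redis side of that plan, assuming the plain
> `redis` client (key and channel names are invented for illustration):
>
>     import json
>     import redis
>
>     r = redis.Redis(host="10.0.0.5")  # hypothetical shared Redis node
>
>     def enqueue(player_id, server_id):
>         # Transient state: who is queued, and which server they're on.
>         r.sadd("queue:ctf", player_id)
>         r.hset("player:server", player_id, server_id)
>
>     def announce_match(match_id, player_ids):
>         # Push notification instead of polling: every game server
>         # subscribed to the "matches" channel sees the assignment.
>         r.publish("matches", json.dumps(
>             {"match": match_id, "players": player_ids}))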
>
> This doesn't seem very elegant, easy to work with, or easy to
> implement, so naturally I don't like it very much. I'm sure it will
> work, but I was hoping someone could suggest a more natural approach.

Hi Krysk

I faced similar design constraints when matching up to four human or 
computer players in a card game. The first, monolithic approach gave me 
a better feeling for how long matchmaking actually takes: today we 
seldom see more than ten tables in the matchmaking process, while up to 
two thousand users are playing cards. Matchmaking isn't the fun part of 
the game, so users get it over with quickly.

After observing user behavior for more than a year, I spread the game 
logic out to separate game servers with a central matchmaking process 
that maintains all the metadata, does the load balancing for the game 
servers, and broadcasts status and activity information to players. The 
metadata stored and passed around is the usual stuff: game skill level, 
likeability, friends, blocked users, number of games played, and so on. 
The data is kept in a MySQL DB, fetched at login, and passed around 
with the player instance.
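
As a minimal sketch of that fetch-at-login pattern with 
twisted.enterprise.adbapi (the table and column names here are invented 
for illustration):

    from twisted.enterprise import adbapi

    # Non-blocking MySQL access driven from the Twisted reactor.
    dbpool = adbapi.ConnectionPool("MySQLdb", db="game", user="game")

    def fetch_metadata(player_id):
        # One query at login; the resulting dict then travels with the
        # player instance, so matchmaking never touches shared state.
        def _unpack(rows):
            skill, likeability, games_played = rows[0]
            return {"skill": skill,
                    "likeability": likeability,
                    "games_played": games_played}
        d = dbpool.runQuery(
            "SELECT skill, likeability, games_played"
            " FROM players WHERE id = %s", (player_id,))
        d.addCallback(_unpack)
        return d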

So far this scheme balances very well; if we ever need to handle many 
more users, I would move the matchmaking process to a dedicated 
machine.

The whole setup, more than 50k games played to completion per day at 
about 13 minutes average play time per game, is handled by a single 
8-core machine with 24GB of RAM, and we usually run no more than 5-6 
game logic server processes. (Back of the envelope: 50,000 games/day 
x 13 min / 1,440 min/day is roughly 450 games in progress at any 
moment, which at up to four players per table lines up with the two 
thousand concurrent users mentioned above.) The machine is well 
balanced and extremely stable; no runaway situation has been observed 
since we deployed the system two years ago.

The bottleneck I foresee in our case is the 100MB/s connection we have 
at the hosting center; currently we are only allowed one interface.

For me, dodging the sharing of metadata during matchmaking was crucial. 
I didn't fear the sharing itself so much as the latency induced by 
sharing metadata among processes or machines, because that added 
latency makes a lot more incongruous things happen in the user's 
experience. On-screen matchmaking with manually selected partners 
already puts quite a strain on the imagination of the average user; 
with added latency on clicks and answers, users shy away from 
matchmaking and start playing alone or with the much more easily 
selected computer players.

HTH, Werner



