I’m currently implementing a web service providing the social features of a game. One of this game feature is the ability to manage a friends list. The friends the user can add depends on the contacts he’s having on an external social network of his choice (currently Facebook or Twitter).
The current behavior of the system is the following:
- The client application uses the social network (Facebook or Twitter) API to retrieve the contact list of the player.
- Each of this contact is provided with a unique identifier (namely, the social network he’s originating from, and its identifier on this social network, for example “Tw12345”).
- The client sends the list of all those identifiers to the game web service hosted on GAE.
- The web service checks for each identifier if it has a user that matches in his own database.
- It returns a list of identifiers, filtered to contains only those who also have a match in the game database.
It obviously doesn’t work well, because most users contacts list are huge. The server is spending a tremendous amount of time checking the database to filter which contact have a matching game account.
Now, I’m having a hard time figuring out how I can proceed more efficiently. As the identifiers aren’t following any given order, I can’t use integer operations to select users on the database. Also, I can’t rely on Twitter or Facebook to do the filtering on their side, because that’s not supported by their API.
I thought of a system using some kind of memcached data tree to store a list of “known” identifiers (as the query only needs to know that there’s a matching user, not which user exactly is matching), but I’m afraid of the time the cache will take to build up anytime it gets cleared.
If any of you have an experience on this kind of set-related trouble, I’ll be very happy to hear it! Thanks!
I assume it’s so slow because you’re doing a query for each user you’re looking up. You can avoid the need to do this with good use of key names.
For each user in your database, insert entities with their key name set to the unique identifier for the social network. These can be the same entities you’re already using, or new ‘index’ entities created just for this purpose.
When sent a list of identifiers, simply do a bulk get operation for all the key names of that entity to identify if they exist – eg, by doing
MyKind.get_by_key_name(key_names).