Friday, January 28, 2011

More App Engine optimization Goodness

So I was recently trying to improve performance on the application and used two tricks to great effect that I want to share with everyone.

First I had some code in one of my tasks like this:

def doTask(self):
    train = NTrain.get(self.request['trainkey'])
    # Do some work
    # This call is a ReferenceProperty and does a get
    player = train.player
    #Do Some work with the player


The problem here is we are doing two gets when we could do 1 get, especially when all NTrain entites are parents of their player. So instead we can do this:


def doTask(self):
    trainkey = db.Key(self.request['trainkey'])
    playerkey = trainkey.parent()
    train, player = db.get((trainkey, playerkey))
    #This will keep you from redoing the get if you use 
    #train.player later
    train.player = player
    #Do all your work here


The second optimization I did was to completely remove the need for a write. In the past my ranking code that ranks players would query players that had a flag set for needing ranking done. Next I would call the ranking code, and then clear the flag for each of the players. Finally I would write all the ranking in a batch and then I would write all the players in a batch.


class Player(db.Model):
 flag = db.BooleanProperty()

def requireRanking(self):
flag = True


class SetRanks(BaseController):
def handleranks(self):
        playerquery = model.Player.all()
        playerquery.filter("flag =", True)

        players = []
        for player in playerquery:
            #Do Work
            player.flag = False
players.append(player)


db.put(players)


Now what I do is set a timestamp instead of a flag when ranking needs done. Now I query for players whose timestamp is greater then the last time the task ran. Now I would write all the ranking in a batch, but there is no longer a need to clear the flag and write changes to the player.

class Player(db.Model):
    lastMoney = db.DateTimeProperty()

    def requireRanking(self):
        lastMoney = datetime.datetime.now()

class SetRanks(BaseController):
    def handleranks(self): 
        lastRankCheck = datetime.datetime.now() -datetime.timedelta(minutes =5)
        playerquery = model.Player.all()        
        playerquery.filter("lastMoney >", lastRankCheck)
        for player in playerquery:
            #Do Work
There are many options for how to keep track of the last time the query was run. In my case knowing it is run every five minutes and making that assumption was OK because of the non-critical nature of the data. In a more critical environment you may want to store the last run time in the datastore. You could also pass it forward from task to task (Having the last task create an instance of the next task with the time it ran as a parameter).

No comments:

Post a Comment