First I had some code in one of my tasks like this:
def doTask(self):
train = NTrain.get(self.request['trainkey'])
# Do some work
# This call is a ReferenceProperty and does a get
player = train.player
#Do Some work with the player
The problem here is we are doing two gets when we could do 1 get, especially when all NTrain entites are parents of their player. So instead we can do this:
def doTask(self):
trainkey = db.Key(self.request['trainkey'])
playerkey = trainkey.parent()
train, player = db.get((trainkey, playerkey))
#This will keep you from redoing the get if you use
#train.player later
train.player = player
#Do all your work here
The second optimization I did was to completely remove the need for a write. In the past my ranking code that ranks players would query players that had a flag set for needing ranking done. Next I would call the ranking code, and then clear the flag for each of the players. Finally I would write all the ranking in a batch and then I would write all the players in a batch.
class Player(db.Model):
flag = db.BooleanProperty()
def requireRanking(self):
flag = True
class SetRanks(BaseController):
def handleranks(self):
playerquery = model.Player.all() playerquery.filter("flag =", True)
players = []
for player in playerquery:
#Do Work
player.flag = False
players.append(player)db.put(players)
Now what I do is set a timestamp instead of a flag when ranking needs done. Now I query for players whose timestamp is greater then the last time the task ran. Now I would write all the ranking in a batch, but there is no longer a need to clear the flag and write changes to the player.
class Player(db.Model):
lastMoney = db.DateTimeProperty()
def requireRanking(self):
lastMoney = datetime.datetime.now()
class SetRanks(BaseController):
def handleranks(self):
lastRankCheck = datetime.datetime.now() -datetime.timedelta(minutes =5)
playerquery = model.Player.all()
playerquery = model.Player.all()
playerquery.filter("lastMoney >", lastRankCheck)
for player in playerquery:
#Do Work
#Do Work
There are many options for how to keep track of the last time the query was run. In my case knowing it is run every five minutes and making that assumption was OK because of the non-critical nature of the data. In a more critical environment you may want to store the last run time in the datastore. You could also pass it forward from task to task (Having the last task create an instance of the next task with the time it ran as a parameter).