Monday 2 June 2008

Sessions in Google App Engine

Google App Engine was a bit of a shock to many application server developers - no sessions.

What is a Session? Every browser request stands alone. The only connection between them is the set of cookies passed back and forth between server and browser. Early CGI used these cookies to hold all important data. This is limited in size and does not allow for any security. Application servers usually set only one cookie - a reference to a session object. Every request from the browser can then be associated with a server-side session, including whether the user is logged in.
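As a rough illustration of that last point, here is a minimal sketch of a server-side session store keyed by the value of a single cookie. The names SESSIONS, new_session and handle_request are made up for this example and do not come from any particular application server.

```python
import uuid

# Server-side store: session id -> session data. Hypothetical, in-memory only.
SESSIONS = {}

def new_session():
    """Create an empty session and return the id to send back as a cookie."""
    session_id = uuid.uuid4().hex
    SESSIONS[session_id] = {}
    return session_id

def handle_request(cookie_session_id, login_as=None):
    """Find the session for a request, creating one if the cookie is unknown."""
    if cookie_session_id not in SESSIONS:
        cookie_session_id = new_session()
    session = SESSIONS[cookie_session_id]
    if login_as is not None:
        session['user'] = login_as
    return cookie_session_id, session

# First request has no cookie; the server hands back a session id.
sid, session = handle_request(None, login_as='alice')
# The next request carries only the id, yet the server still knows the user.
_, session_again = handle_request(sid)
```

The browser never sees the session contents, only the id, which is why the cookie can stay small.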

For efficiency on a single server these sessions are kept in memory. For small clusters the network (typically a load balancer) makes sure that a session is consistently routed to the same computer.

With the Google App Engine you do not know which server runs the application and which database store holds the data. This gives us massive scalability, the vaunted Google speed advantage, redundancy, reliability and much more. The cost? Well, we can't hold a session in memory because different machines could very well serve different requests.
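The problem can be shown with a toy model. This is only a sketch: the Server class and the plain dictionary standing in for the shared data store are inventions for the example, not anything App Engine provides.

```python
class Server:
    """A stand-in for one machine; 'memory' is private to that machine."""

    def __init__(self, shared_store):
        self.memory = {}
        self.shared_store = shared_store  # visible to every server

    def remember_in_memory(self, session_id, key, value):
        self.memory.setdefault(session_id, {})[key] = value

    def recall_from_memory(self, session_id, key):
        return self.memory.get(session_id, {}).get(key)

    def remember_in_store(self, session_id, key, value):
        self.shared_store.setdefault(session_id, {})[key] = value

    def recall_from_store(self, session_id, key):
        return self.shared_store.get(session_id, {}).get(key)

datastore = {}
server_a = Server(datastore)
server_b = Server(datastore)

# The first request lands on server A.
server_a.remember_in_memory('abc', 'user', 'alice')
server_a.remember_in_store('abc', 'user', 'alice')

# The second request lands on server B, which has never seen this session.
lost = server_b.recall_from_memory('abc', 'user')
found = server_b.recall_from_store('abc', 'user')
```

Only the value written to the shared store survives the move between machines, which is why session data on App Engine has to live in the data store.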

Why we must have a Session. Surfing the web is like reading a book. We hold the thread of the context in our heads as we read. Using an application server is more interactive - like a conversation. A conversation requires that both participants hold the context so that it can extend beyond a single exchange.

The Browser Session. All modern browsers hold a session using cookies. Cookies are associated with a particular web site or path on a web site. They are held by the browser and passed back with each request. Both browser and server can set new cookies. In-memory cookies only last until the browser is closed. Cookies can be persistent, but since they are part of the browser they are specific to a single computer. For safety, cookies can have a time-out after which they are removed. Because cookies are sent back and forth with every exchange in the conversation they are limited in size. Cookies are great when used within their limitations.
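To make the cookie properties above concrete, here is a small sketch using the http.cookies module from the Python 3 standard library (not the App Engine SDK). The cookie names and values are invented for the example.

```python
from http.cookies import SimpleCookie

# Server side: build the Set-Cookie headers for a response.
response = SimpleCookie()
response['session_id'] = 'abc123'      # in-memory cookie: no expiry is set
response['session_id']['path'] = '/'   # valid for the whole site
response['theme'] = 'dark'             # persistent cookie
response['theme']['max-age'] = 3600    # time-out: dropped after one hour
header_text = response.output()

# Next request: the browser sends the names and values back in one header.
request = SimpleCookie()
request.load('session_id=abc123; theme=dark')
session_id = request['session_id'].value
```

Note that the attributes (path, time-out) travel only from server to browser; the browser returns just the name and value pairs.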

The Google Session. Yes, I know that I said Google App Engine did not provide session management. This is not entirely true. It does provide a Users API. And guess what - it is referenced by a cookie. Any Google App Engine code can pick up a small amount of user-specific data - name, email address and a nickname to display. It is almost certainly kept in the same data store as our own data, but it is likely to be optimised.

A Session we can use. Because I needed connectivity early on I used some of the earliest examples. There has been a lot of session discussion on the forums since, but as I have an acceptable solution I have stopped following them in detail. Because data retrieval is expensive I wanted lazy loading.

My solution is a class (in session.py). I add an instance of it to the parameters passed to any Django template:


params["session"] = Session(request)


Session is not persistent - it does not inherit from db.Model. The idea is to keep it light-weight until something is needed.


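from google.appengine.api import users

class Session:
    def __init__(self, request):
        # Write to __dict__ directly so the overridden __setattr__ is bypassed.
        self.__dict__['request'] = request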


The request object is of type HttpRequest with all the relevant information available. It can also be used to hold non-persistent data for use in a single request.

Because a Django template resolves a dotted name by trying a dictionary lookup, then an attribute lookup, and calls the result if it is callable, a single session object can become all-encompassing.


    def user(self):
        # Look the user up on first access only, then reuse the cached object.
        if not self.__user:
            self.__user = users.get_current_user()
        return self.__user


So, a template can access the Google user object - as in {{session.user.nickname}}. I also use the session object for other system information:
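The lazy-loading idea behind user() can be tried outside App Engine. The following is only a sketch: LazySession and fake_lookup are hypothetical stand-ins, with fake_lookup playing the part of the Users API call so that the number of lookups can be counted.

```python
class LazySession:
    """Look the user up on first access only, then reuse the cached result."""

    def __init__(self, lookup):
        self._lookup = lookup  # hypothetical stand-in for the Users API call
        self._user = None

    def user(self):
        if self._user is None:
            self._user = self._lookup()
        return self._user

calls = []

def fake_lookup():
    # Record each call so we can see how often the "expensive" work happens.
    calls.append(1)
    return {'nickname': 'alice'}

session = LazySession(fake_lookup)
no_calls_yet = len(calls)   # creating the session costs nothing
first = session.user()      # the lookup happens here
second = session.user()     # served from the cached value
```

A page whose template never mentions the user never pays for the lookup at all.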


    def loginURL(self):
        return users.create_login_url('/')

    def isAdmin(self):
        return users.is_current_user_admin()


If we were to inherit session from db.Expando we would have to save the whole session any time a piece of data changed. I prefer to only update the data that needs changing by overriding __getattr__ and __setattr__:


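    def __getattr__(self, name):
        # Only called when the attribute is not already cached in __dict__.
        if name.startswith('_'):
            return None
        self.__dict__[name] = value = UserData.Load(name).value
        return value

    def __setattr__(self, name, value):
        if name.startswith('_'):
            # Underscore names stay on the instance and are never persisted.
            item = value
        else:
            try:
                def modify(data):
                    data.value = value
                item = UserData.Modify(modify, name)
            except Exception:
                # Requires "import logging" at the top of session.py.
                logging.error('Setting session data for ' + name)
                item = value
        self.__dict__[name] = item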


UserData saves data to BigTable keyed to a specific user - posted at App Engine Fan: Saving user-specific data

So, instance data starting with an underscore is not saved to persistent storage. Nor is data in session.request. Anything else is persisted as separate data objects. Because UserData is a functional db.Expando, items can be of any property type that db.Expando supports. For larger data groups, a reference to another database object or object tree would be suitable.
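The same pattern can be tried without App Engine. In this sketch a plain dictionary, FAKE_STORE, stands in for UserData and the data store; it and SketchSession are inventions for the example and simplify the real class (no error handling, no per-user keying).

```python
# Stand-in for the data store; the real code goes through UserData instead.
FAKE_STORE = {}

class SketchSession(object):
    def __init__(self, request):
        # Bypass __setattr__ so the request itself is never persisted.
        self.__dict__['request'] = request

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. nothing is cached yet.
        if name.startswith('_'):
            return None
        value = FAKE_STORE.get(name)
        self.__dict__[name] = value
        return value

    def __setattr__(self, name, value):
        if not name.startswith('_'):
            FAKE_STORE[name] = value   # persisted, one item at a time
        self.__dict__[name] = value    # cached on the instance

# One request sets some data...
session = SketchSession(request={'path': '/'})
session.colour = 'blue'      # persisted
session._scratch = 'temp'    # kept on this instance only

# ...and a later request, with a fresh Session object, reads it back.
later = SketchSession(request={'path': '/other'})
```

Here later.colour comes back from the store while later._scratch is None, which mirrors the underscore rule described above.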
