Thursday 10 July 2008

Appengine, Python and doctests

Up until now I have been working mostly with UI on Google AppEngine. Now I am starting on the real meat of Golf Adept - with the first more-than-trivial model - the primary stroke record. A model by itself does not necessarily require unit tests, but the loading of the model is special as it comes from the client. A user name is converted to a user object, latitude and longitude to a GeoPt, and date/time to internal format. As it happens, the last was the most important.

I love the concept of doctests. A developer learns a new interface best by example. A doctest is a running example where it will do the most good - with the source.

I had already used doctests on business logic files. It is just a matter of executing the doctest library if the module is run as a program. Now I want something entirely different - to run a doctest in the appengine context.
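
As a minimal sketch of that pattern (written for a current Python 3 rather than the runtime this post was written against), a business logic file can carry its examples and run them itself. The function `metres_to_yards` and its numbers are made up for illustration and are not from Golf Adept.

```python
"""Hypothetical business-logic module with an embedded doctest."""
import doctest


def metres_to_yards(metres):
    """Convert a distance in metres to whole yards.

    >>> metres_to_yards(100)
    109
    """
    return int(round(metres * 1.09361))


if __name__ == "__main__":
    # Running the file directly executes every example in its docstrings.
    # Nothing is printed unless an example fails.
    doctest.testmod()
```

Importing the module elsewhere leaves the tests dormant; only running the file as a program triggers them.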

I have chosen to integrate the doctest and web frameworks so a test can be run when needed. My framework is Django because that is what I am using. I use a common library so that the doctest will work in any project I create. If you don't use Django, change the example to webapp and point to it with app.yaml.

To make some sense I need to describe my application layout. My application urls.py is relatively empty, but references one in a common library:


from lib.view import urls

from django.conf.urls.defaults import patterns

urlpatterns = patterns(
    '',
)
urlpatterns += urls.urlpatterns


The library one does the work:


from django.conf.urls.defaults import patterns

# from lib.view import urls
# urlpatterns += urls.urlpatterns
urlpatterns = patterns(
    'lib.view.page',
    (r'^$', 'main'),
    (r'^html/(.+)$', 'static'),
    (r'^active/(.+)$', 'content'),
    (r'^cms/(.*)$', 'cms'),
    (r'^admin/(.*)$', 'admin'),
    (r'^doctest/(.*)$', 'doctest'),
    (r'^(.+)$', 'content'),
)


In short, a module called page.py has a function called doctest that is called and passed the rest of the URL. So http://localhost:8080/doctest/model.record will run the sample doctest.
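
To see what "the rest of the URL" means, here is a small sketch that applies the same regular expression outside Django. The helper name `module_path_from` is mine; the only thing taken from the application is the pattern itself, and the sketch assumes Django matches it against the path with the leading slash removed.

```python
import re

# Pattern copied from the library urls.py shown above.
DOCTEST_ROUTE = re.compile(r'^doctest/(.*)$')


def module_path_from(path):
    """Return the captured module path, or None if the route does not match."""
    match = DOCTEST_ROUTE.match(path.lstrip('/'))
    return match.group(1) if match else None


print(module_path_from('/doctest/model.record'))  # model.record
print(module_path_from('/html/index.html'))       # None
```

The captured group is what arrives in the view as its second argument.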

Here is the Python to run the specified doctest:


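from StringIO import StringIO
from django.http import HttpResponse

def doctest(request, modulePath):
    """ Given a module as part of the URL, run a doctest on it.
    eg: http://localhost:8080/doctest/model.record
    """
    import imp, doctest
    # doctest uses imp.get_suffixes - but appengine doesn't allow the use of imp.
    # It is only used to check the module for a binary, so we can bypass it.
    def get_suffixes(): return None
    imp.get_suffixes = get_suffixes
    # doctest writes to stdout. We need to save that to a string to drop into
    # the response.
    import sys
    stdout = sys.stdout
    try:
        sys.stdout = StringIO()
        module = __import__(modulePath, globals(), locals(), [''])
        doctest.testmod(m=module, verbose=False)
        content = sys.stdout.getvalue()
        # [Lines lost when this post was published: an "if len(content) < ..."
        # check on the captured output, and presumably the loaded-modules
        # cleanup described in note 3 below. They are not reconstructed here.]
    finally:
        # Put the real stdout back (placement reconstructed from the
        # surviving fragment "sys.stdout = stdout").
        sys.stdout = stdout
    return HttpResponse(content=content, mimetype='text/plain')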

All the tricks that I sweated to discover are documented above.


  1. Google overrides imp, as it provides a level of access that is risky for a shared environment. Unfortunately doctest uses imp to check that it is not given a binary file. Since we are providing controlled data, we can bypass the test by returning no suffixes.
  2. doctest throws everything to the console. A CGI program sends console output back to the browser. The problem is that Django expects the view to create the contents on demand. So, redirect stdout, grab the output and toss it to the browser.
  3. Lastly, doctest loads and runs. If you run again without changing the code, it is already loaded. It must keep static data, as it tries to combine the results from the current and last run. The solution is to remove the reference in the loaded modules so it will reload every time.
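
Points 2 and 3 can be sketched outside App Engine. This is an illustration for a current Python 3, not the code from page.py: the function name `run_doctests_captured` is hypothetical, and dropping the module under test from `sys.modules` is only my reading of "remove the reference in the loaded modules".

```python
import doctest
import importlib
import io
import sys


def run_doctests_captured(module_name):
    """Run a module's doctests and return (output_text, failed, attempted)."""
    saved_stdout = sys.stdout
    sys.stdout = io.StringIO()  # doctest reports by writing to stdout
    try:
        module = importlib.import_module(module_name)
        failed, attempted = doctest.testmod(m=module, verbose=False)
        text = sys.stdout.getvalue()
    finally:
        # Restore the real stdout even if the import or a test blows up.
        sys.stdout = saved_stdout
        # Assumption: forget the module so the next call imports it afresh.
        sys.modules.pop(module_name, None)
    return text, failed, attempted
```

With `verbose=False` the captured text is empty when every example passes, so a caller can treat any output as a failure report.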



Testing a Google AppEngine Model
My example is a real file.

  1. It does not validate data as it is getting said data from a trusted source.
  2. It uses a static load method to massage the input data, create a record and save it.
  3. I use a generated key name so that if the same data is loaded more than once it will not be duplicated in the database.
  4. The tests are in the docstring at the head of any method or class.
  5. They call a _test() method that loads a record and checks the database for a result.
  6. The test also deletes the record. Being a good little test it cleans up after itself.
  7. This is a first release with basic tests. When integrating it with other parts of the system it may break. Rather than just fixing the error, it makes a lot of sense to replicate the problem in a new doctest line so that any fix can be proved to stay fixed. Besides, it is a lot faster to run a doctest than to follow a certain manual path through the UI.
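
Point 3 is easy to show without the datastore. In this sketch a plain dict stands in for App Engine, and the key expression is the one that is visible in the listing below (user plus course); the helper names are mine.

```python
# Stand-in for the App Engine datastore: a plain dict keyed by key name.
datastore = {}


def make_key_name(values):
    """Build a repeatable key name from the user and course fields."""
    return 'k;' + values[0] + ';' + values[2]


def load(values):
    """Store one parsed line; the same input always lands on the same key."""
    key = make_key_name(values)
    datastore[key] = list(values)
    return key


line = "fred@bloggs.com,newHole,Ashgrove,08-07-12 21:15:12,12.34,56.78,90.12,note one"
first = load(line.split(','))
second = load(line.split(','))
print(first == second, len(datastore))  # True 1
```

Because the key is derived from the data rather than generated at save time, loading the same line twice overwrites one entry instead of creating two.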


# Copyright 2008 Askowl Pty Limited
from google.appengine.ext import db
from google.appengine.api import users
from google.appengine.api.datastore_types import GeoPt
import datetime

class Record(db.Model):
    """ Model object encompassing a record taken on the golf course and downloaded
    from a mobile phone.

    >>> _test("fred@bloggs.com,newHole,Ashgrove,08-07-12 21:15:12,12.34,56.78,90.12,note one")
    "{u'lie': None, u'direction': None, u'distance': None, u'club': None, u'type': u'newHole', u'altitude': 90.120000000000005, u'course': u'Ashgrove', u'stroke': None, u'location': datastore_types.GeoPt(12.34, 56.780000000000001), u'time': datetime.datetime(2008, 7, 12, 21, 15, 12), u'quality': None, u'notes': u'note one', u'user': users.User(email='fred@bloggs.com')}"
    >>> _test("john@brown.com,stroke,Indooroopilly,08-09-23 09:01:22,43.21,87.65,21.09,note two"
    ... ",5-iron,full,fairway,clean,straight,136")
    "{u'lie': u'fairway', u'direction': u'straight', u'distance': 136.0, u'club': u'5-iron', u'type': u'stroke', u'altitude': 21.09, u'course': u'Indooroopilly', u'stroke': u'full', u'location': datastore_types.GeoPt(43.210000000000001, 87.650000000000006), u'time': datetime.datetime(2008, 9, 23, 9, 1, 22), u'quality': u'clean', u'notes': u'note two', u'user': users.User(email='john@brown.com')}"
    """
    user = db.UserProperty()
    type = db.CategoryProperty()
    course = db.StringProperty()
    time = db.DateTimeProperty()
    location = db.GeoPtProperty()
    altitude = db.FloatProperty()
    notes = db.StringProperty()
    club = db.StringProperty()
    stroke = db.StringProperty()
    lie = db.StringProperty()
    quality = db.StringProperty()
    direction = db.StringProperty()
    distance = db.FloatProperty()

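    @staticmethod
    def load(values):
        count = len(values)
        if count == 0:
            return None
        # The lines from here down to "if count > 8" were mangled when the
        # post was published and are rebuilt from the surviving fragments.
        # Assumptions: the minimum field count (8), what happens when it is
        # not met, and the strptime format (inferred from the doctests above).
        if count < 8:
            return None
        key = 'k;'+values[0]+';'+values[2]
        record = Record(key_name=key)
        record.user = users.User(values[0])
        record.type = values[1]
        record.course = values[2]
        record.time = datetime.datetime.strptime(values[3], "%y-%m-%d %H:%M:%S")
        record.location = GeoPt(float(values[4]),float(values[5]))
        record.altitude = float(values[6])
        record.notes = values[7]
        if count > 8:
            record.club,record.stroke,record.lie,\
            record.quality,record.direction = values[8:13]
            record.distance = float(values[13])
        record.put()
        return key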

def _test(line):
    values = line.split(',')
    key = Record.load(values)
    record = Record.get_by_key_name(key)
    # Named "text" so the builtin repr() is not shadowed.
    text = record._entity.__repr__()
    record.delete()
    return text
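
Since the date/time conversion turned out to be the important one, here it is in isolation. The format string is inferred from the sample lines and expected results in the doctests above (two-digit year first); the names `TIME_FORMAT` and `parse_time` are mine.

```python
import datetime

# Two-digit year first, as in the sample lines: "08-07-12 21:15:12".
TIME_FORMAT = "%y-%m-%d %H:%M:%S"


def parse_time(text):
    """Convert the phone's timestamp text to a naive datetime."""
    return datetime.datetime.strptime(text, TIME_FORMAT)


print(repr(parse_time("08-07-12 21:15:12")))
# datetime.datetime(2008, 7, 12, 21, 15, 12)
```

The result carries no time zone, which is worth keeping in mind when reading the next section.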


Immediate Benefit


My first run found a non-trivial problem with writing date/time objects to the database. The first time the test was run, the date recorded was adjusted by local time. Subsequent runs within a few seconds would record the date in UTC. Waiting for 30 seconds or so, or changing the source, would cause the fault again on the first run only. Some research found a Google issue (131) that is marked as fixed in 1.02. I am running 1.1. Fortunately a fix to the datastore file listed here still worked.

Future Improvements


The doctest method could set an HTTP status code if a test fails. This way we can execute the HTTP request from curl and use the return result to control other actions (such as checking in the code).
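
A rough sketch of the idea, for a current Python 3: `doctest.testmod` returns the failed and attempted counts, so the view only has to map them to a status. The choice of 500 for a failure and the name `status_for` are my assumptions; the throwaway module exists only to make the example runnable.

```python
import doctest
import types


def status_for(module):
    """Run a module's doctests and return (http_status, failed, attempted)."""
    failed, attempted = doctest.testmod(m=module, verbose=False)
    # Assumption: any failure is reported as a server error.
    return (500 if failed else 200), failed, attempted


# A throwaway module with one passing example, built in memory.
passing = types.ModuleType("hypothetical_passing")
passing.__doc__ = ">>> 1 + 1\n2\n"
print(status_for(passing))
```

In the Django view the first element would be passed to the response as its status, and a curl option that turns HTTP errors into a non-zero exit code would do the rest.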


This package allows a single doctest to be run from the browser. It would not be difficult to integrate this with other examples where you would pass a package name and the code would walk the tree looking for and running doctests in all the modules.
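
One way the tree walk could look, sketched with the standard library for a current Python 3. The function name `run_package_doctests` is hypothetical, and the sketch assumes every module in the package is safe to import, since importing is how doctest gets at the docstrings.

```python
import doctest
import importlib
import pkgutil


def run_package_doctests(package_name):
    """Run doctests in a package and every module beneath it.

    Returns a dict of {module_name: (failed, attempted)}.
    """
    package = importlib.import_module(package_name)
    names = [package_name]
    # walk_packages yields submodules and subpackages recursively.
    for info in pkgutil.walk_packages(package.__path__, package_name + "."):
        names.append(info.name)
    results = {}
    for name in names:
        module = importlib.import_module(name)
        failed, attempted = doctest.testmod(m=module, verbose=False)
        results[name] = (failed, attempted)
    return results
```

Summing the first element of each value gives a single pass/fail answer for the whole package.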


For a larger team, continuous integration is valuable. If you have made the changes above, it would be simple to hit the URL from the continuous integration server, saving the result and taking pass/fail from the return code.