Friday, February 17, 2012

Rietveld architecture: AppEngine/Django request processing

Just a quick note/reminder of the request handling flow in the AppEngine environment for a mixed AE/Django application such as Rietveld. Hopefully it provides a good entrypoint for understanding how AppEngine works. I use Rietveld as an example because the project was basically born to show how to run Django on AE.

Rietveld is a Django application that is run by AppEngine. Let's leave all the Django stuff aside and first learn how AppEngine loads and initializes Python applications.


Import and execution in Python web apps

In PHP, your code is read and interpreted (executed) from start to finish every time a new request arrives. In Python, the code is read (imported) and executed only once; on subsequent requests only the part that handles the request is invoked, over and over. The catch is that the first request is always different in Python, because it is the one that triggers the import.
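
The difference can be seen in a minimal sketch. The names below (`SETTINGS`, `LOAD_COUNT`, `handle_request`) are made up for illustration and have nothing to do with Rietveld itself: the module body runs once at import time, while the handler function is called for every request and reuses what the import left behind.

```python
# Module-level code: runs once, when the module is first imported.
LOAD_COUNT = 0
LOAD_COUNT += 1
SETTINGS = {"greeting": "hello"}  # stand-in for expensive one-time setup


def handle_request(path):
    """Runs on every request and reuses state created at import time."""
    handle_request.calls += 1
    return "%s from %s (call #%d)" % (
        SETTINGS["greeting"], path, handle_request.calls)


# Per-process call counter, kept on the function object for brevity.
handle_request.calls = 0
```

Calling `handle_request()` several times bumps its call counter, but `LOAD_COUNT` stays at 1 for the lifetime of the process.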


What happens when an application is uploaded to AppEngine?


Step 1. Standard AppEngine application loading and initialization sequence
  • AppEngine reads app.yaml to understand how to load the application (which version of Python it requires and which URLs are handled by which Python scripts)
  • AppEngine initializes the application by creating an instance for it
  • Then it looks at the URL and executes the script that should process this URL according to app.yaml
This applies to every AppEngine application.


Example: --- app.yaml from Rietveld project --->
It is the entrypoint for understanding any AppEngine app. If you want to know what is called when you request a URL, the first thing to do is to look there.
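
As a rough model of that lookup, here is a sketch in plain Python. The handler table is hypothetical and only mimics the shape of the handlers section of an app.yaml file; it is not copied from Rietveld, and the matching code is a simplification, not what AppEngine actually runs. The point it illustrates is that patterns are tried in order and the first match decides which script gets the request.

```python
import re

# Hypothetical handler table modelled on the "handlers:" section of app.yaml.
# Patterns and targets are illustrative only.
HANDLERS = [
    (r"/static/.*", "static files"),
    (r"/.*", "main.py"),
]


def find_handler(path, handlers=HANDLERS):
    """Return the target of the first pattern that matches the whole path."""
    for pattern, target in handlers:
        if re.fullmatch(pattern, path):
            return target
    return None
```

With this table, `/static/style.css` is served as a static file and everything else falls through to main.py, which is why the order of entries matters.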

Step 2. Python code to fine-tune AppEngine params and configure request handler

All requests in Rietveld (except static files) are handled by main.py. It does the following:

  • Imports appengine_config.py, which in turn:
    • Initializes and tunes Appstats tool
    • Chooses version of Django to use (1.2 currently)
    • Configures Django to read settings.py with Rietveld specific parameters
  • Adds logger for all exceptions
  • Removes Django's DB rollback event handler (because Rietveld doesn't use DB layer of Django)
  • Creates request handler using Django
  • Passes the handler to AE's run_wsgi_app() util to give Django control to process the request
Django hasn't fired at this point, and nothing magical has happened yet.
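
The last two steps can be sketched with nothing but the standard library. This is not Rietveld's main.py: `make_handler()` is a stand-in for creating the Django handler, and `run_once()` is a stand-in for AE's run_wsgi_app(). Both names are made up. What the sketch shows is that a WSGI handler is just a callable, and that building it does no request processing until something calls it.

```python
from wsgiref.util import setup_testing_defaults


def make_handler():
    """Stand-in for creating the Django handler; returns a WSGI callable."""
    def handler(environ, start_response):
        body = ("handled " + environ["PATH_INFO"]).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain"),
                                  ("Content-Length", str(len(body)))])
        return [body]
    return handler


def run_once(app, path):
    """Stand-in for the runner: build an environ and call the app once."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


def main():
    app = make_handler()           # nothing is processed yet
    return run_once(app, "/mine")  # the request is handled here
```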

Step 3. Request handling magic

Request handling starts with run_wsgi_app(). This magical function implicitly imports appengine_config.py to read its own settings behind the scenes and then gives control to the Django handler created earlier.
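
To take some of the magic out of it, here is a sketch of what "implicitly imports a config module" can look like. Everything in it is an assumption made for illustration: the module name `example_config`, the hook name `add_wsgi_middleware`, and the `run_app()` function are hypothetical and are not AppEngine's actual loading code. The idea is that the runner tries to import a well-known module, applies a hook if it finds one, and only then calls the handler.

```python
import importlib
import sys
import types

# Hypothetical config module; a real project would keep this in a file.
config = types.ModuleType("example_config")


def add_wsgi_middleware(app):
    """Hypothetical hook: wrap the app so each response gets an extra header."""
    def wrapped(environ, start_response):
        def tagged_start(status, headers, exc_info=None):
            return start_response(status, headers + [("X-Wrapped", "yes")])
        return app(environ, tagged_start)
    return wrapped


config.add_wsgi_middleware = add_wsgi_middleware
sys.modules["example_config"] = config  # make it importable by name


def run_app(app, environ, config_name="example_config"):
    """Import the config module if present, apply its hook, call the app."""
    try:
        module = importlib.import_module(config_name)
    except ImportError:
        module = None  # no config module: run the app unchanged
    hook = getattr(module, "add_wsgi_middleware", None)
    if hook is not None:
        app = hook(app)
    seen = {}

    def start_response(status, headers, exc_info=None):
        seen["status"], seen["headers"] = status, headers

    body = b"".join(app(environ, start_response))
    return seen["status"], seen["headers"], body


def hello_app(environ, start_response):
    """Trivial WSGI app standing in for the Django handler."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"hello"]
```

The application code never mentions the config module, yet the response comes back wrapped. That is the kind of behind-the-scenes behavior that makes the first read of main.py confusing.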

Django reads the settings.py mentioned earlier and processes its options before executing anything application- or request-specific:

  • Configures middlewares, which is important because they provide such things as the user object in the request:
    • django.middleware.common.CommonMiddleware  - doesn't seem to be used (docs)
    • django.middleware.http.ConditionalGetMiddleware - not sure why it is needed
    • codereview.middleware.AddUserToRequestMiddleware - this one also fetches user-specific parameters from Account record
    • codereview.middleware.PropagateExceptionMiddleware - logs and rewrites exceptions to be more user-friendly
  • Sets urls.py to be the ROOT_URLCONF - mapping between URLs and handler functions in views.py
  • Enables django.core.context_processors.request, which adds the `request` object to templates
  • Configures template loaders
  • Configures file uploads
  • Configures the URL used to generate paths to static files as `/static/`
  • Rietveld's own constants, such as the incoming email address, are also defined here
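
Since middlewares are the least obvious item in this list, here is a simplified sketch of how a chain of them treats a request. The classes are hypothetical and are not the Rietveld middlewares named above. They only follow the convention Django middleware used at the time: each one has a process_request() method that returns None to let the request continue, or a response to stop the chain early.

```python
class Request(object):
    """Minimal stand-in for Django's request object."""
    def __init__(self, path, email=None):
        self.path = path
        self.email = email  # hypothetical: who is logged in, if anyone


class AddUserMiddleware(object):
    """Hypothetical middleware that attaches a user to the request."""
    def process_request(self, request):
        request.user = request.email or "anonymous"
        return None  # None means: continue with the next middleware


class BlockPrivateMiddleware(object):
    """Hypothetical middleware that short-circuits some requests."""
    def process_request(self, request):
        if request.path.startswith("/private") and request.user == "anonymous":
            return "403 Forbidden"
        return None


MIDDLEWARE = [AddUserMiddleware(), BlockPrivateMiddleware()]


def view(request):
    """Stand-in for a view function; relies on request.user being set."""
    return "200 OK for %s" % request.user


def handle(request, middleware=MIDDLEWARE):
    """Run each middleware in order, then the view if none returned early."""
    for mw in middleware:
        response = mw.process_request(request)
        if response is not None:
            return response
    return view(request)
```

The view can use `request.user` without checking for it only because a middleware that ran earlier put it there, which is why the order and content of the middleware list matter.
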
After all of the above is done, the Django handler starts processing the request:
  • It looks into urls.py to find which function should process the requested URL
  • urls.py just delegates to codereview/urls.py, which holds the actual mapping, so it reads the latter as well
  • Finds the associated function name and calls that function from views.py
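
The lookup described in these steps can be modelled in a few lines. This is a sketch under assumptions, not Django's resolver: the patterns and the view functions `index` and `show_issue` are hypothetical, and the two lists merely play the roles of urls.py and codereview/urls.py.

```python
import re


def index(request):
    """Hypothetical view function."""
    return "index"


def show_issue(request, issue_id):
    """Hypothetical view function taking a value captured from the URL."""
    return "issue %s" % issue_id


# Plays the role of codereview/urls.py: the actual mapping.
APP_URLS = [
    (r"^$", index),
    (r"^(?P<issue_id>\d+)/show$", show_issue),
]

# Plays the role of urls.py: it only delegates to the list above.
ROOT_URLS = [
    (r"^", APP_URLS),
]


def resolve(path, patterns):
    """Return (view, kwargs) for the first match, following nested lists."""
    for regex, target in patterns:
        match = re.match(regex, path)
        if not match:
            continue
        if isinstance(target, list):
            # Delegate the unmatched remainder of the path to the nested list.
            found = resolve(path[match.end():], target)
            if found:
                return found
        else:
            return target, match.groupdict()
    return None
```

Calling `resolve("123/show", ROOT_URLS)` walks from the root list into the nested one and returns `show_issue` together with the captured `issue_id`, so the caller can invoke the view as `view(request, **kwargs)`.
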
And that's basically the entrypoint you need to start hacking on Rietveld/Django and AppEngine.