Preread: “Django 1.1 Testing and Debugging”, by Karen M. Tracey

Another Packt Publishing title is on the way for review, `Django 1.1 Testing and Debugging <>`_, by Karen M. Tracey. Django 1.2 ships tomorrow, and I’m looking forward to the book: testing is one of the things that helps software evolve, but it’s also one of the things that’s easiest to ignore on a project. I say this to myself as much as anyone: even though I know tests will make my life better in the long run, when I start hacking, sometimes they’re the furthest thing from my mind. Books about how to test, and how to do it effectively, are definitely a good thing for me.

The PDF preview of chapter 3 (basic unit testing) looks good, and the table of contents looks like a good overview of basic tools and techniques. I’m particularly looking forward to reading about integrating Django with other testing tools (chapter 5), and using Django with pdb (chapter 9). I’m looking forward to learning more about testing my Django applications — and hopefully how I can form better habits around testing.

date:2010-05-16 21:39:04
tags:django, pre-read, python

Back to the Future: Desktop Applications

One of the best prepared talks I saw at PyCon this year was on Phatch, a cross-platform photo processing application written in Python. Stani Michiels and Nadia Alramli gave a well rehearsed, compelling talk discussing the ins and outs of developing their application for Linux, Mac OS X, and Windows. The video is available from the excellent Python MiroCommunity.

The talk reminded me of a blog post I saw late last year and never got around to commenting on, Ruby for Desktop Applications? Yes we can. Now I’m only a year late in commenting on it. This post caught my eye for two reasons. First, the software they discuss was commissioned by the AGI Goldratt Institute. I had heard about Goldratt from my father, whose employer, Trusted Manufacturing, was working on implementing constraints-based manufacturing as a way to reduce costs and distinguish themselves from the rest of the market. More interesting, though, was their discussion of how they built the application, and how it seemed to resonate with some of the work I did in my early days at CC.

Atomic wrote three blog posts (at least that I saw), and the one with the most text (as determined by my highly unscientific “page down” method) was all about how they “rolled” the JRuby application: how they laid out the source tree, how they compile Ruby source into Java JARs, and how they distribute a single JAR file with their application and its dependencies. I thought this was interesting because even though it uses a different language (Python instead of Ruby), GUI framework (wx instead of Swing/Batik), and runtime strategy (bundled interpreter instead of bytecode archive), the thing I spent the most time on when I was developing CC Publisher was deployment.

Like Atomic and Phatch, we had a single code base that we wanted to work across the major platforms (Windows, Linux, and Mac OS X in our case). The presentation about Phatch has some great information about making desktop-specific idioms work in Python, so I’ll let them cover that. Packaging and deployment was the biggest challenge, one we never quite got right.

On Windows, we used py2exe to bundle our Python runtime with the source code and dependencies. This worked most of the time, unless we forgot to specify a sub-package in our manifest, in which case it blew up in amazing and spectacular ways (not really). Like Atomic, we used NSIS for the Windows installer portion. On Mac OS X, we used py2app to do something similar, and distributed a disk image. On Linux… well, on Linux, we punted. We experimented with cx_Freeze and flirted with autopackage. But nothing ever worked quite right [enough], so we wound up shipping tarballs.

The really appealing thing about Atomic’s approach is that by using a single JAR, you get to leverage a much bigger ecosystem of tools: the Java community has either solved, or has well defined idioms for, launching Java applications from JARs. You get launch4j and izpack, which look like great additions to the desktop developer’s toolbox.

For better or for worse, we [Creative Commons] decided CC Publisher wasn’t the best place to put our energy and time. This was probably the right decision, but it was a fun project to work on. (We do have rebooting CC Publisher listed as a suggested project for Google Summer of Code, if someone else is interested in helping out.) Given the maturity of Java’s desktop tool chain, and the vast improvements in Jython over the past year or two, I can imagine considering an approach very much like Atomic’s were I working on it today. Even though it seems like the majority of people’s attention is on web applications these days, I like seeing examples of interesting desktop applications being built with dynamic languages.

date:2010-03-30 09:04:03
tags:cc, ccpublisher, python

Using pip with buildout

I’ve been asked to add a blog to koucou, and this has turned out to be more of a learning experience than I expected. My first instinct was to use WordPress — I’m familiar with it, like the way it works, and I’m not interested in building my own. The one wrinkle was that we wanted to integrate the blog visually with the rest of the site, which is built on Django. I decided to give Mingus a try. This post isn’t about Mingus — I’ll write about that shortly — but rather about pip, which Mingus uses to manage dependencies. Mingus includes a requirements file with the stable dependencies for the application (one of its goals is application re-use, so there are a lot of them). As I mentioned previously, pip is the Python packaging/installation tool I have the least experience with, so I decided to try converting my existing project to pip as a starting point — to gain experience with pip, and to try and ease integration woes with Mingus.

When I started, the project used the following setup to manage dependencies and the build process:

  • Dependencies which have an egg or setuptools-compatible sdist available are specified in install_requires in setup.py:

        name = "soursop",
        # ... details omitted
        install_requires = ['setuptools',
                            # other dependencies elided
                            ],
  • A buildout configuration that uses djangorecipe to install Django, and zc.recipe.egg to install the application egg and its dependencies

    [buildout]
    develop = .
    parts = django scripts
    unzip = true
    eggs = soursop

    [django]
    recipe = djangorecipe
    version = 1.1.1
    settings = settings
    eggs = ${buildout:eggs}
    project = soursop

    [scripts]
    recipe = zc.recipe.egg
    eggs = ${buildout:eggs}
    interpreter = python
    dependent-scripts = true
    extra-paths = ${django:location}
    initialization =
        import os
        os.environ['DJANGO_SETTINGS_MODULE'] = '${django:project}.${django:settings}'
  • Dependencies that didn’t easily install using setuptools (either they didn’t have a sane source-tree layout or weren’t available from PyPI) are either specified as git submodules or imported into the repository.

All this worked pretty well (although I’ve never really loved git submodules).

gp.recipe.pip is a buildout recipe which allows you to install a set of Python packages using pip. gp.recipe.pip builds on zc.recipe.egg, so it inherits all the functionality of that recipe (installing dependencies declared in setup.py, generating scripts, etc). So in that respect, I could simply replace the recipe line in the scripts part and start using pip requirements to install from source control, create editable checkouts, etc.

Previously, I used the ${buildout:eggs} setting to share a set of packages to install between the django part (which I used to generate a Django management script) and the scripts part (which I used to resolve the dependency list and install scripts defined as entry points). I didn’t spend much time looking into replicating this with gp.recipe.pip; it wasn’t immediately clear to me how to get a working set out of it that’s equivalent to an eggs specification (I’m not even sure it makes sense to expect such a thing).

Ignoring the issue of the management script, I simplified my buildout configuration, removing the django part and using gp.recipe.pip:

[buildout]
develop = .
parts = soursop
unzip = true
eggs = soursop
django-settings = settings
django-project = soursop

[soursop]
recipe = gp.recipe.pip
interpreter = python
eggs = ${buildout:eggs}
sources-directory = vendor
initialization =
    import os
    os.environ['DJANGO_SETTINGS_MODULE'] = '${buildout:django-project}.${buildout:django-settings}'

This allowed me to start specifying the resources I previously included as git submodules as pip requirements:

[soursop]
recipe = gp.recipe.pip
interpreter = python
install =
    -r requirements.txt
eggs = ${buildout:eggs}
sources-directory = vendor

The install parameter specifies a series of pip dependencies that buildout will install when it runs. These can include version control URLs, recursive requirements (in this case, a requirements file, requirements.txt), and editable dependencies. In this case I’ve also specified a directory, vendor, in which editable dependencies will be installed.
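For example, the install setting can mix all of these forms on adjacent lines (the repository URL and version pin here are illustrative, not my actual dependency list):

```
install =
    -r requirements.txt
    -e git+git://example.org/projects/some-app.git#egg=some-app
    Django==1.1.1
```

Each line is handed to pip as a requirement, so anything pip accepts on its command line should work here.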

That actually works pretty well: I can define my list of dependencies in a text file on its own, and I can move away from git submodules and vendor imports to specifying [D]VCS urls that pip will pull.

Unfortunately, I’m still missing my manage script. I wound up creating a small function and entry point to cause the script to be generated. In soursop/scripts.py, I created the following function:

def manage():
    """Entry point for Django manage command; assumes
    DJANGO_SETTINGS_MODULE has been set in the environment.

    This is a convenience for getting a ./bin/manage console script
    when using buildout."""

    from django.core import management
    from django.utils import importlib
    import os

    settings = importlib.import_module(os.environ.get('DJANGO_SETTINGS_MODULE'))

    management.execute_manager(settings)


In setup.py, I added an entry point:

entry_points = {
       'console_scripts' : [
           'manage = soursop.scripts:manage',
           ],
       },

Re-run buildout, and a manage script appears in the bin directory. Note that I’m still using the environment variable, DJANGO_SETTINGS_MODULE, to specify which settings module we’re using. I could specify the settings module directly in my manage script wrapper. I chose not to do this because I wanted to emulate the behavior of djangorecipe, which lets you change the settings module in buildout.cfg (i.e., from development to production settings). This is also the reason I have custom initialization code specified in my buildout configuration.

Generally I really like the way this works. I’ve been able to eliminate the tracked vendor code in my project, as well as the git submodules. I can easily move my pip requirements into a requirements file and specify it with -r in the install line, separating dependency information from build information.

There are a couple things that I’m ambivalent about. Primarily, I now have two different places where I’ve declared some of my dependencies: setup.py and a requirements file. Each has advantages (which are correspondingly disadvantages for the other). Specifying the requirements in the pip requirements file gives me more flexibility — I can install from subversion, git, or mercurial without even thinking about it. But if someone installs my package from a source distribution using easy_install or pip, the dependencies won’t necessarily be satisfied [1] [2]. And conversely, specifying the requirements in setup.py allows everyone to introspect them at installation time, but sacrifices the flexibility I’ve gained from pip.

I’m not sure that we’ll end up using Mingus for koucou, but I think we’ll stick with gp.recipe.pip. The disadvantage is a small one (at least in this situation), and it’s not really any worse than the previous situation.

[1]I suppose I could provide a bundle for pip that includes the dependencies, but the documentation doesn’t make that seem very appealing.
[2]Inability to install my Django application from an sdist isn’t really a big deal: the re-use story just isn’t good enough (in my opinion) to have it make sense. Generally, however, I like to be able to install a package and pull in the dependencies as well.
date:2010-03-28 13:05:22
tags:dependencies, django, koucou, pip, python, scm, zc.buildout

Pre-read: Grok 1.0 Web Development

Late last month I received an email from Packt Publishing, asking if I’d be interested in reviewing one of their new titles, `Grok 1.0 Web Development <>`_, by Carlos de la Guardia. I immediately said yes, with the caveat that I’m traveling a lot over the next 30 days, so the review will be a little delayed (hence this pre-review). I said “yes” because Grok is one of the Python web frameworks that’s most interesting to me these days. It’s interesting because one of its underlying goals is to take concepts from the Zope Toolkit (formerly Zope 3) and make them more accessible and less daunting. These concepts — the component model, pluggable utilities, and graph-based traversal — are some of the most powerful tools I’ve worked with during my career. And of course, they can also be daunting, even to people with lots of experience; making them more accessible is a good thing.

I’ve read the first four chapters of Grok 1.0 Web Development, and so far there’s a lot to like. It’s the sort of documentation I wish I’d had when I ported the Creative Commons license chooser to Grok1. I’m looking forward to reading the rest, and will post a proper review when I return from Nairobi. In the mean time, check out Grok, Zope 3 for cavemen.

You can download a preview from Grok 1.0 Web Development, `Chapter 5: Forms </media/2010/03/7481-grok-1-0-Web-development-sample-chapter-5-forms.pdf>`_.

1 The CC license chooser has evolved a lot over the years; shortly after Grok was launched we adopted many of its features as a way to streamline the code. Grok’s simplified support for custom traversal, in particular, was worth the effort.

date:2010-03-16 09:14:50
tags:cc, grok, pre-read, python, reading, zope

For Some Definition of “Reusable”

I read “Why I switched to Pylons after using Django for six months” yesterday, and it mirrors something I’ve been thinking about off and on for the past year or so: what is the right level of abstraction for reuse in web applications? I’ve worked on two Django-based projects over the past 12-18 months: CC Network and koucou. Neither is what I’d call “huge”, but in both cases I wanted to re-use existing apps, and in both cases it felt… awkward.

Part of this awkwardness is probably the impedance mismatch of the framework and the toolchain: Django applications are Python packages. The Python tools for packaging and installing (distutils, setuptools, distribute, and pip, I think, although I have the least experience with it) work on “module distributions”1: some chunk of code with a setup.py. This is as much a “social” issue as a technology one: the documentation and tools don’t encourage the “right” kind of behavior, so talk of re-usable applications is often just hand waving or, at best, reinvention2.

In both cases we consciously chose Django for what I consider its killer app: the admin interface. But there have been re-use headaches. [NB: What follows is based on our experience, which is setuptools and buildout based] The first one you encounter is that not every developer of a reusable app has made it available on PyPI. If they’re using Subversion you can still use it with setuptools, but re-using from git requires some additional work (a submodule or another buildout recipe). I understand pip just works with the most common [D]VCSes, but I haven’t used it myself. Additionally, they aren’t all structured as projects, and those that are don’t always declare their dependencies properly3. And finally there are the “real” issues of templates, URL integration, etc.

I’m not exactly sure what the answer is, but it’s probably 80% human (as opposed to technology). Part of it is practicing good hygiene: writing your apps with relocatable URLs, using proper URL reversal when generating intra-applications URLs, and making sure your templates are somewhat self-contained. But even that only gets you so far. Right now I have to work if I want to make my app easily consumable by others; work, frankly, sucks.

Reuse is one area where I think Zope 3 (and its derived frameworks, Grok and repoze.bfg) have an advantage: if you’re re-using an application that provides a particular type of model, for example, all you need to do is register a view for it to get a customized template. The liberal use of interfaces to determine context also helps smooth over some of the URL issues4. Just as importantly (or more so), they have a strong culture of writing code as small “projects” and using tools like buildout to assemble the final product.

Code reuse matters, and truth in advertising matters just as much or more. If we want to encourage people to write reusable applications, the tools need to support that, and we need to be explicit about what the benefits we expect to reap from reuse are.

1 Of course you never actually see these referred to as module distributions; always projects, packages, eggs, or something else.

2 Note that I’m not saying that Pylons gets the re-use story much better; the author admits choosing Django at least in part because of the perceived “vibrant community of people writing apps” but found himself more productive with Pylons. Perhaps he entered into that with different expectations? I think it’s worth noting that we chose Django for a project, in part, for the same reason, but with different expectations: not that the vibrant community writing apps would generate reusable code, but that it would educate developers we could hire when the time came.

3 This is partially due to the current state of Python packaging: setuptools and distribute expect the dependency information to be included in setup.py; pip specifies it in a requirements file.

4 At least when dealing with graph-based traversal; it could be true in other circumstances, I just haven’t thought about it enough.

date:2010-03-09 18:38:54
tags:django, python, web, zope

We called it “magic”

Just under ten years ago I started working at Canterbury School doing a variety of things. One thing I wound up doing was building the new Intro to Computer curriculum, based on Python. When Vern and I presented our approach at PyCon in 2003, we were asked what advantages we thought Python had over its predecessor in the curriculum, Java. The first answer was always, “Magic; a lack thereof.” There was less boilerplate, fewer incantations, a much shorter list of things you have to wave your hands about and say, “Don’t worry, we’ll talk about this later in the semester. For right now, it’s magic, just do it.” Magic distracts students, and makes them wonder what you’re hiding.

Seeing a comparison between Java and Clojure (albeit one you can read as more about succinctness than clarity), I was reminded that this lack of magic — boilerplate, ceremony, whatever — is still important.

date:2010-01-06 19:04:13
category:aside, development
tags:magic, python

Caching WSGI Applications to Disk

This morning I pushed the first release of wsgi_cache to PyPI, laying the groundwork for increasing sanity in our deployment story at CC. wsgi_cache is disk caching middleware for WSGI applications. It’s written with our needs specifically in mind, but it may be useful to others, as well.

The core of Creative Commons’ technical responsibilities are the licenses: the metadata, the deeds, the legalcode, and the chooser. While the license deeds are mostly static and structured in a predictable way, there are some “dynamic” elements; we sometimes add more information to try and clarify the licenses, and volunteers are continuously updating the translations that let us present the deeds in dozens of languages. These are dynamic in a very gross sense: once generated, we can serve the same version of each deed to everyone. But there is an inherent need to generate the deeds dynamically at some point in the pipeline.

Our current toolset includes a script for [re-]generating all or some of the deeds. It does this by [ab]using the Zope test runner machinery to fire up the application and make lots of requests against it, saving the results in the proper directory structure. The result of this is then checked into Subversion for deployment on the web server. This works, but it has a few shortfalls and it’s a pretty blunt instrument. wsgi_cache, along with work Chris Webber is currently doing to make the license engine a better WSGI citizen, aims to streamline this process.

The idea behind wsgi_cache is that you create a disk cache for results, caching only the body of the response. We only cache the body for a simple reason — we want something else, something faster, like Apache or other web server, to serve the request when it’s a cache hit. We’ll use mod_rewrite to send the request to our WSGI application when the requested file doesn’t exist; otherwise it hits the on disk version. And cache “invalidation” becomes as simple as rm (and as fine grained as single resources).
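Sketched as Apache configuration, that arrangement might look roughly like this (a hedged sketch: the WSGI script name and layout are hypothetical, not our actual deployment):

```apache
RewriteEngine On

# if no cached file exists on disk for this path, hand the request to
# the WSGI application (which writes the response body to the cache);
# when the file does exist, Apache serves it directly and the
# application is never touched
RewriteCond %{DOCUMENT_ROOT}%{REQUEST_URI} !-f
RewriteRule ^(.*)$ /licenses.wsgi/$1 [QSA,L]
```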

There are some limitations which might make this a poor choice for other applications. Because you’re only caching the response body, it’s impossible to store other header information. This can be a problem if you’re serving up different content types which can’t be inferred from the path (our filenames don’t carry content-type extensions, so we tell Apache to override the content type for everything; this works for our particular scenario). Additionally, this approach only makes sense if you have another front end server that can serve up the cached version faster; I doubt that wsgi_cache will win any speed challenges for serving cached versions.
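The core idea (cache only the body, and let the front end serve hits) fits in a few lines of WSGI middleware. What follows is a sketch of the approach in modern Python, not wsgi_cache’s actual API or implementation:

```python
import os


class DiskCacheMiddleware(object):
    """Cache response bodies to disk, keyed by request path.

    Hypothetical illustration of body-only caching; headers are
    deliberately not stored, so a front-end server must supply them.
    """

    def __init__(self, app, cache_dir):
        self.app = app
        self.cache_dir = cache_dir

    def __call__(self, environ, start_response):
        # map the request path to a file under the cache directory
        rel_path = environ.get('PATH_INFO', '/').lstrip('/') or 'index'
        cache_path = os.path.join(self.cache_dir, rel_path)

        # run the wrapped application and collect the body
        body = b''.join(self.app(environ, start_response))

        # store only the body; "invalidation" is just deleting the file
        os.makedirs(os.path.dirname(cache_path) or self.cache_dir,
                    exist_ok=True)
        with open(cache_path, 'wb') as f:
            f.write(body)

        return [body]
```

A front end like Apache would be configured to serve the cache directory directly and fall through to the application on a miss.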

We’re not quite ready to roll it out yet, and I expect we’ll find some things that need to be tweaked, but a test suite with 100% coverage makes that a challenge I’m up for. If you’re interested in taking a look (and adapting it for your own use), you can find the code in Creative Commons’ git repository.

date:2010-01-05 23:37:29
category:cc, development
tags:cache, cc, middleware, python, wsgi, wsgi_cache

Nested Formsets with Django

I’ve published an updated post about nested formsets, along with a generic implementation and demo application on GitHub.

I spent Labor Day weekend in New York City working on a side project with Alex. The project is coming together (albeit slowly, sometimes), and there have been a few interesting technical challenges. Labor Day weekend I was building an interface for editing data on the site. The particular feature I’m working on uses a multi-level data model; an example of this kind of model would be modeling City Blocks, where each Block has one or more Buildings, and each Building has one or more Tenants. Using this as an example, I was building the City Block editor.

Django Formsets manage the complexity of multiple copies of a form in a view. They help you keep track of how many copies you started with, which ones have been changed, and which ones should be deleted. But what if you’re working with this hypothetical data model and want to allow people to edit the Buildings and Tenants for a Block, all on one page? In this case you want each form in the Building formset to have a complete Tenant formset, all its own. The Django Formset documentation is silent on this issue, possibly (probably?) because it’s an edge case and one that almost certainly requires some application-specific thought. I spent the better part of two days working on it — the first pretty much a throw away, the second wildly productive thanks to TDD — and this is what I came up with.

Formsets act as wrappers around Django forms, providing the accounting machinery and convenience methods needed for managing multiple copies of the form. My experience has been that, unlike forms where you have to write your form class (no matter how simple), you write a Formset class infrequently. Instead you use the factory functions which generate a default that’s suitable for most situations. As with regular Forms and Model Forms, Django offers Model Formsets, which simplify the task of creating a formset for a form that handles instances of a model. In addition to model formsets, Django also provides inline formsets, which make it easier to deal with a set of objects that share a common foreign key. So in the example data model, an instance of the inline formset might model all the Buildings on a Block, or all the Tenants in the Building. Even if you’re not interested in nested formsets, the inline formsets can be incredibly useful.

Let’s go ahead and define the models for our example:

class Block(models.Model):
    description = models.CharField(max_length=255)

class Building(models.Model):
    block = models.ForeignKey(Block)
    address = models.CharField(max_length=255)

class Tenant(models.Model):
    building = models.ForeignKey(Building)
    name = models.CharField(max_length=255)
    unit = models.CharField(max_length=255)

After we have our models in place we need to define the forms. The nested form is straight-forward — it’s just a normal inline formset.

from django.forms.models import inlineformset_factory

TenantFormset = inlineformset_factory(models.Building, models.Tenant, extra=1)

Note that inlineformset_factory not only creates the Formset class, but it also creates the ModelForm for the model (models.Tenant in this example).

The “host” formset which contains the nested one — BuildingFormset in our example — requires some additional work. There are a few cases that need to be handled:

  1. Validation — When validating an item in the formset, we also need to validate its sub-items (those on its nested formset).
  2. Saving existing data — When saving an item, changes to the items in the nested formset also need to be saved.
  3. Saving new parent objects — If the user adds “parent” data as well as sub-items (so adding a Building, along with Tenants), the nested form won’t have a reference back to the parent unless we add it ourselves.
  4. Finally, the very basic issue of creating the nested formset instance for each parent form.

Before delving into those issues, let’s look at the basic formset declaration.

from django.forms.models import BaseInlineFormSet

class BaseBuildingFormset(BaseInlineFormSet):

BuildingFormset = inlineformset_factory(models.Block, models.Building,
                                formset=BaseBuildingFormset, extra=1)

Here we declare a sub-class of the BaseInlineFormSet and then pass it to the inlineformset_factory as the class we want to base our new formset on.

Let’s start with the most basic piece of functionality: associating the nested formsets with each form. The super class defines an add_fields method which is responsible for adding the fields (and their initial values since this is a model-based Form) to a specific form in the formset. This seemed as good a place as any to add our formset creation code.

class BaseBuildingFormset(BaseInlineFormSet):

    def add_fields(self, form, index):
        # allow the super class to create the fields as usual
        super(BaseBuildingFormset, self).add_fields(form, index)

        # create the nested formset
        try:
            instance = self.get_queryset()[index]
            pk_value = instance.pk
        except IndexError:
            instance = None
            pk_value = hash(form.prefix)

        # store the formset in the .nested property;
        # pass data only when the parent formset is bound
        form.nested = [
            TenantFormset(data=self.data or None,
                          instance=instance,
                          prefix='TENANTS_%s' % pk_value)]

The heart of what we’re doing here is in the last statement: creating a form.nested property that contains a list of nested formsets — only one in our example and in the code I implemented; more than one would probably be a UI nightmare. In order to initialize the formset we need two pieces of information: the parent instance and a form prefix. If we’re creating fields for an existing instance we can use the get_queryset method to return the list of objects. If this is a form for a new instance (i.e., the form created by specifying extra=1), we need to specify None as the instance. We include the object’s primary key in the form prefix to make sure the formsets are named uniquely; if this is an extra form we hash the parent form’s prefix (which will also be unique). The Django documentation has instructions on using multiple formsets in a single view that are relevant here.
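The prefix rule can be exercised on its own; this hypothetical helper just restates the logic above (the primary key when the parent exists, a hash of the parent form’s unique prefix otherwise):

```python
def nested_prefix(instance_pk, parent_prefix):
    # existing parents are keyed by primary key; extra (new) forms
    # fall back to a hash of the parent form's unique prefix
    if instance_pk is not None:
        return 'TENANTS_%s' % instance_pk
    return 'TENANTS_%s' % hash(parent_prefix)


# two existing buildings plus one extra form: all prefixes distinct
prefixes = {nested_prefix(1, 'building-0'),
            nested_prefix(2, 'building-1'),
            nested_prefix(None, 'building-2')}
```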

Now that we have the nested formset created we can display it in the template.

def edit_block_buildings(request, block_id):
    """Edit buildings and their tenants on a given block."""

    block = get_object_or_404(models.Block, id=block_id)

    if request.method == 'POST':
        formset = forms.BuildingFormset(request.POST, instance=block)

        if formset.is_valid():
            formset.save_all()

            return redirect('block_view', block_id=block_id)

    else:
        formset = forms.BuildingFormset(instance=block)

    return render_to_response('rentals/edit_buildings.html',
                              {'buildings': formset})

edit_buildings.html (fragment)

{{ buildings.management_form }}
{% for building in buildings.forms %}

  {{ building }}

  {% if building.nested %}
  {% for formset in building.nested %}
  {{ formset.as_table }}
  {% endfor %}
  {% endif %}

{% endfor %}

When the page is submitted, the idiom is to call formset.is_valid() to validate the forms. We override is_valid on our formset to add validation for the nested formsets as well.

class BaseBuildingFormset(BaseInlineFormSet):

    def is_valid(self):
        result = super(BaseBuildingFormset, self).is_valid()

        for form in self.forms:
            if hasattr(form, 'nested'):
                for n in form.nested:
                    # make sure each nested formset is valid as well
                    result = result and n.is_valid()

        return result

Finally, assuming the form validates, we need to handle saving. As I mentioned earlier, there are two different situations here — saving existing data (and possibly adding new nested data) and saving completely new data.

For new data we need to override save_new and update the parent reference for any nested data after we save (well, instantiate) the parent.

class BaseBuildingFormset(BaseInlineFormSet):

    def save_new(self, form, commit=True):
        """Saves and returns a new model instance for the given form."""

        instance = super(BaseBuildingFormset, self).save_new(form, commit=commit)

        # update the form's instance reference
        form.instance = instance

        # update the instance reference on nested forms
        for nested in form.nested:
            nested.instance = instance

            # iterate over the cleaned_data of the nested formset and
            # update the foreign key reference to the newly saved parent
            for cd in nested.cleaned_data:
                cd['building'] = instance

        return instance

Finally, we add a save_all method for saving the parent formset and all nested formsets.

from django.forms.formsets import DELETION_FIELD_NAME

class BaseBuildingFormset(BaseInlineFormSet):

    def should_delete(self, form):
        """Convenience method for determining if the form's object will
        be deleted; cribbed from BaseModelFormSet.save_existing_objects."""

        if self.can_delete:
            raw_delete_value = form._raw_value(DELETION_FIELD_NAME)
            should_delete = form.fields[DELETION_FIELD_NAME].clean(raw_delete_value)
            return should_delete

        return False

    def save_all(self, commit=True):
        """Save all forms, along with their nested formsets."""

        # Save without committing (so self.saved_forms is populated)
        # -- We need self.saved_forms so we can go back and access
        #    the nested formsets
        objects = self.save(commit=False)

        # Save each instance if commit=True
        if commit:
            for o in objects:
                o.save()

        # save many to many fields if needed
        if not commit:
            self.save_m2m()

        # save the nested formsets
        for form in set(self.initial_forms + self.saved_forms):
            if self.should_delete(form):
                continue

            for nested in form.nested:
                nested.save(commit=commit)
There are two methods defined here; the first, should_delete, is lifted almost directly from code in django.forms.models.BaseModelFormSet.save_existing_objects. It takes a form object in the formset and returns True if the object for that form is going to be deleted. We use this to short-circuit saving the nested formsets: no point in saving them if we’re going to delete their required ForeignKey.

The save_all method is responsible for saving (updating, creating, deleting) the forms in the formset, as well as all the nested formsets for each form. One thing to note is that regardless of whether we’re committing our save (commit=True), we initially save the forms with commit=False. When you save a model formset with commit=False, Django populates a saved_forms attribute with the list of all the forms saved — new and old. We need this list of saved forms to make sure we are able to save any nested formsets that are attached to newly created forms (ones that did not exist when the initial request was made). After we know saved_forms has been populated we can do another pass to commit if necessary.
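None of the class names below are Django's — this is a toy, pure-Python sketch of just that two-pass control flow, showing why the initial save(commit=False) has to happen before the commit pass:

```python
class FakeForm:
    def __init__(self, data):
        self.data = data


class FakeFormset:
    """Toy stand-in for a model formset: save() must populate
    saved_forms even when commit=False, because the later passes
    (committing, saving nested formsets) iterate over it."""

    def __init__(self, forms):
        self.forms = forms
        self.saved_forms = []
        self.committed = []

    def save(self, commit=True):
        objects = [form.data for form in self.forms]
        self.saved_forms = list(self.forms)  # populated regardless of commit
        if commit:
            self.committed.extend(objects)
        return objects

    def save_all(self, commit=True):
        # first pass: no commit, but saved_forms is filled in
        objects = self.save(commit=False)
        # second pass: actually persist if asked to
        if commit:
            self.committed.extend(objects)
        return objects


formset = FakeFormset([FakeForm("building 1"), FakeForm("building 2")])
formset.save_all()
assert [f.data for f in formset.saved_forms] == ["building 1", "building 2"]
assert formset.committed == ["building 1", "building 2"]
```

The real version has more moving parts (deletion checks, the nested loop), but the ordering constraint is the same: nothing after the first save() can run until saved_forms exists.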

There are certainly places this code could be improved, tightened up, or generalized (for example, the nested formset prefix calculation and possibly save_all). It's also entirely plausible that you could wrap much of this in a factory function. But this gets nested editing working, and once you wrap your head around what needs to be done, it's actually fairly straightforward.

date:2009-09-27 19:42:42
category:development, koucou
tags:django, formsets, howto, orm, python

PyCon 2010 CFP: Five Days Left

The CFP for PyCon 2010 closes in five days. I’m on the program committee this year and it’s exciting to see good proposals come in. From the CFP:

Want to showcase your skills as a Python Hacker? Want to have hundreds of people see your talk on the subject of your choice? Have some hot button issue you think the community needs to address, or have some package, code or project you simply love talking about? Want to launch your master plan to take over the world with python?

PyCon is your platform for getting the word out and teaching something new to hundreds of people, face to face.

Previous PyCon conferences have had a broad range of presentations, from reports on academic and commercial projects, tutorials on a broad range of subjects and case studies. All conference speakers are volunteers and come from a myriad of backgrounds. Some are new speakers, some are old speakers. Everyone is welcome so bring your passion and your code! We’re looking to you to help us top the previous years of success PyCon has had.

PyCon 2010 is looking for proposals to fill the formal presentation tracks. The PyCon conference days will be February 19-22, 2010 in Atlanta, Georgia, preceded by the tutorial days (February 17-18), and followed by four days of development sprints (February 22-25).

Online proposal submission is open now! Proposals will be accepted through October 1st, with acceptance notifications coming out on November 15th. For the detailed call for proposals, please see:

For videos of talks from previous years – check out:

We look forward to seeing you in Atlanta!

date:2009-09-25 15:07:50
tags:cfp, conference, pycon, python

Unicode output from Zope 3

The Creative Commons license engine has gone through several iterations, the most recent being a Zope 3 / Grok application. This has actually been a great implementation for us[1]_, but since the day it was deployed there’s been a warning in `README.txt <>`_:

If you get a UnicodeDecodeError from the cc.engine (you’ll see this if it’s running in the foreground) when you try to access http://host:9080/license/ then it’s likely that the install of Python you are using is set to use ASCII as its default output.  You can change this to UTF-8 by creating the file /usr/lib/python<version>/ and adding these lines:

  import sys

This always struck me as a bit inelegant — having to muck with something outside my application directory. After all, this belief that the application should be self-contained is the reason I use zc.buildout and share Jim’s belief in the evil of the system Python. Like a lot of inelegant things, though, it never rose quite to the level of annoyance needed to motivate me to do it right.

Today I was working on moving the license engine to a different server[2]_ and ran into this problem again. I decided to dig in and see if I could track it down. In fact I did track down the initial problem — I was making a comparison between an encoded byte string and a Unicode string without specifying an explicit codec to use for the decode. Unfortunately once I fixed that I found it was turtles all the way down.
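The failure is easy to reproduce outside of Zope; the sample string below is my own invention, but the error (and the ascii codec it names) is exactly the one the README warns about:

```python
text = u"caf\xe9"                # any Unicode text with a non-ASCII character
encoded = text.encode("utf-8")   # the byte-string half of the comparison

# With an explicit codec the round-trip is fine:
assert encoded.decode("utf-8") == text

# Python 2's implicit coercion decoded with the system default codec
# (ascii), which is where the UnicodeDecodeError comes from:
try:
    encoded.decode("ascii")
except UnicodeDecodeError as e:
    error = e
assert error.encoding == "ascii"
```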

Turns out the default Zope 3 page template machinery uses `StringIO <>`_ to collect the output. StringIO uses, uh, strings — strings with the default system encoding. Reading the module documentation, it would appear that mixing String and Unicode input in your StringIO will cause this sort of issue.
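Python 3 later made this mixing a hard error rather than a deferred one; a small sketch of the same hazard against the modern io.StringIO (the sample text is invented):

```python
import io

buf = io.StringIO()
buf.write(u"deed of license: caf\xe9\n")   # text input: fine

# Python 2's StringIO.StringIO accepted byte strings too, and only
# failed later — with the default ascii codec — when the pieces were
# joined; Python 3's io.StringIO refuses bytes up front instead.
try:
    buf.write(b"8-bit bytes")
    accepted_bytes = True
except TypeError:
    accepted_bytes = False

assert accepted_bytes is False
assert buf.getvalue() == u"deed of license: caf\xe9\n"
```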

Andres suggested marking my templates as UTF-8 XML using something like:

<?xml version="1.0" encoding="UTF-8" ?>

but even after doing this and fixing the resulting entity errors, there are still obviously some 8-bit strings leaking into the output. In conversations on IRC the question was then asked: “is there a reason you don’t want a reasonable system wide encoding if your locale can support it?”

I guess not[3]_.

UPDATE Martijn has a tangentially related post which sheds some light on why Python does/should ship with ascii as the default codec. At least people smarter than me have problems with this sort of thing, too.

[1]Yes, I may be a bit biased — I wrote the Zope3/Grok implementation. Of course, I wrote the previous implementation, too, and I can say without a doubt it was… “sub-optimal”.
[2]We’re doing a lot of shuffling lately to complete a 32 to 64 bit conversion; see the CC Labs blog post for the harrowing details.
[3]So the warning remains.
date:2008-07-19 12:57:33
category:cc, development
tags:cc, development, license engine, python, zope