DocSix: Doctests on Python 2 & 3

I was first introduced to doctests working on Zope 3 at early PyCon sprints. At the time the combination of documentation, specification, and test in a single document seemed pretty interesting to me. These days I like to use them for testing my documentation.

Last week stvs2fork helpfully opened a pull request for Rebar, fixing some syntax that’s no longer valid in Python 3. I decided that it’d be interesting to add Python 3.3 to the automated test runs. Fixing the code to work with Python 3 was easy enough, but when I ran the doctests I discovered an issue I hadn’t thought of:

Unicode string output looks different in Python 3 vs Python 2.

>>> validator = AgeValidator()
>>> validator.errors({'age': 'ten'})
{'age': [u'An integer is required.']}

This example works exactly the same in Python 2 and 3: in both cases the error messages are returned as a list of Unicode strings. But in Python 2 the output has the leading u indicator. Not so in Python 3.

What I needed to do was strip the Unicode indicator from the expected output strings before executing the test; then I’d have the Python 3 doctest I needed. So I wrote a tool that lets me do that.
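Here is a minimal sketch of that stripping step, assuming a regular expression is good enough for the job. It is not DocSix's actual implementation, and strip_unicode_prefix is a hypothetical name. It would also wrongly alter a string whose text contains a lone u right before a quote, so treat it as an illustration only.

```python
import re

# Hypothetical helper, not DocSix's real code: match a u/U that sits
# directly before a quote and is not the tail of a longer word.
_UNICODE_PREFIX = re.compile(r"(?<![A-Za-z0-9_])[uU](?=['\"])")


def strip_unicode_prefix(expected):
    """Return the expected doctest output without Python 2 u'' prefixes."""
    return _UNICODE_PREFIX.sub("", expected)


python2_output = "{'age': [u'An integer is required.']}"
print(strip_unicode_prefix(python2_output))
```

The lookbehind keeps the u inside words such as "required" intact, since only a u that starts a token is removed.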

DocSix lets you run your doctests on Python 2 and 3.

DocSix builds on Manuel, a library for mixing custom test syntax into doctests. DocSix can work with existing uses of Manuel, or it can load your doctests into a unittest TestSuite, ready to go:

from docsix import get_doctest_suite

test_suite = get_doctest_suite(
    'index.rst',
    'advanced.rst',
)


author:Nathan Yergler
category:development
tags:python, doctests, testing, python3
comments:

Revisiting Nested Formsets

It’s been nearly four years since I first wrote about nested formsets. When I wrote about nested formsets, I must have been using Django 1.1 (based on correlating dates in the release notes and the original blog post), which means what I wrote has had four major releases of Django to drift out of date. And yet it’s still one of the most frequently visited posts on my blog, and one of the few that I receive email questions about. Four years later, it seemed like the time to revisit the original post to see if nested formsets still make sense and if so, what they look like now.

Formsets help manage the complexity of maintaining multiple instances of a Form on a single page. For example, if you’re editing a list of items on a single page, each individual item may be a copy of the same form. Formsets help manage things like HTML ID generation, flagging forms for deletion, and validating the entire set of forms together. When used with Models, they allow you to edit the members of a QuerySet all at once.

So what are nested formsets? The example I used previously was something along the lines of Block – Building – Tenant: one Block has many Buildings, and each Building has many Tenants. If you’re editing a Block, you want to see all the Buildings and all the Tenants at once. That’s a fine hypothetical, but one of the questions I get with some frequency is “what’s a good use case for a nested formset?” Four years later, two and a half of them spent doing web development full time, I have yet to encounter a situation where I needed a nested formset. In that time I’ve built some pretty complex forms, including Eventbrite’s event creation flow. That page was complex enough that I built Form Groups to support the interaction, and I think the jury is still out on whether that was a good idea or not. It’s possible that there are use cases for nested formsets in admin-style applications that I haven’t encountered. I think it’s also possible that there are reasons to use a nested formset alongside a JavaScript framework to ease the user experience.

Note that if you only have one level of relationships on the page (i.e., you’re editing all the Tenants for a single Building in our example) then you don’t need nested formsets: Django’s inline formsets will work just fine.

And why not nested formsets? From the questions people have asked and my experience building Form Groups (which borrowed some ideas), I’ve concluded that they’re difficult to get completely right, have edge cases that can be hard to manage, and create quite complicated user interfaces. In my original blog post I alluded to the fact that I spent most of a three-day weekend trying to get the nested formsets to work right. Two thirds of that time was spent on work I eventually threw away, because I couldn’t manage the edge cases. It was only when I started using TDD that I managed to get something working. But I didn’t publish the tests with my previous code example, so no one else was able to benefit from that work.

If you’ve read this far and still think a nested formset is the best solution for your problem, what would that look like with Django 1.5? The answer is: simpler. I decided to rewrite my initial implementation using test-driven development. The full implementation of the formset logic only overrides three methods from BaseInlineFormSet.

from django.forms.models import (
    BaseInlineFormSet,
    inlineformset_factory,
)


class BaseNestedFormset(BaseInlineFormSet):

    def add_fields(self, form, index):

        # allow the super class to create the fields as usual
        super(BaseNestedFormset, self).add_fields(form, index)

        form.nested = self.nested_formset_class(
            instance=form.instance,
            data=form.data if self.is_bound else None,
            prefix='%s-%s' % (
                form.prefix,
                self.nested_formset_class.get_default_prefix(),
            ),
        )

    def is_valid(self):

        result = super(BaseNestedFormset, self).is_valid()

        if self.is_bound:
            # look at any nested formsets, as well
            for form in self.forms:
                result = result and form.nested.is_valid()

        return result

    def save(self, commit=True):

        result = super(BaseNestedFormset, self).save(commit=commit)

        for form in self:
            form.nested.save(commit=commit)

        return result

These three methods cover the four areas of functionality I called out in the previous post: validation (is_valid), saving (both existing and new objects are handled here by save), and instantiation (creating the nested formset instances, handled by add_fields).

By making it a general purpose base class, I’m also able to write a simple factory function, to make using it more in tune with Django’s built-in model formset.

def nested_formset_factory(parent_model, child_model, grandchild_model):

    parent_child = inlineformset_factory(
        parent_model,
        child_model,
        formset=BaseNestedFormset,
    )

    parent_child.nested_formset_class = inlineformset_factory(
        child_model,
        grandchild_model,
    )

    return parent_child

You can find the source to this general purpose implementation on GitHub. I wrote tests at each step as I worked on this, so it may be interesting to go back and look at individual commits, as well.

So how would you use this with Django 1.5? With a class-based view, of course.

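# Django 1.5 location of reverse; later versions use django.urls
from django.core.urlresolvers import reverse
from django.views.generic.edit import UpdateView

# `models` and `nested_formset_factory` are assumed to be imported from
# your own project; the exact module paths depend on your layout.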

class EditBuildingsView(UpdateView):
    model = models.Block

    def get_template_names(self):

        return ['blocks/building_form.html']

    def get_form_class(self):

        return nested_formset_factory(
            models.Block,
            models.Building,
            models.Tenant,
        )

    def get_success_url(self):

        return reverse('blocks-list')

Of course there’s more needed — templates, for one — but this shows just how easy it is to create the views and leverage a generic abstraction. The real keys here are specifying model = models.Block and the definition of get_form_class. Django’s UpdateView knows how to implement the basic form processing idiom (GET, POST, redirect), so all you need to do is tell it which form to use.

You can find a functional, albeit ugly, demo application in the demo directory of the git repository.

So that’s it: a general purpose, updated implementation of nested formsets. I advise using them sparingly :).

author:Nathan Yergler
category:development
tags:django, formsets, forms, python
comments:

Destroyed 0003

I’m blogging my way through Gary Bernhardt’s excellent Destroy All Software series of screencasts. If you write software, you should probably go buy it today.

In Episode 3, Gary builds a simple version of RSpec, using TDD. I’d seen him do something similar at the Testing in Python BOF at PyCon this year, when he trolled the audience with Ruby, challenging the assertion that “RSpec is hard!” with derision and flair.

The interesting part about the screencast, then, was watching him drive his coding with tests. I’d describe myself as a “testing believer”, but I think I get tripped up at the same place a lot of people do: where do you begin? How do you know what test to write first, when you don’t even know what the call interface is going to look like?

So I found myself exclaiming as he began: “that test doesn’t do shit!” Indeed, the first test doesn’t do anything other than test that there’s this describe thing, that happens to take an argument. So the primary lesson I took away from Episode 3 was that when it comes to TDD, you don’t have to know where you’ll end up. You just need to start.
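To make that concrete, here is roughly what such a first test might look like if I translated the idea to Python's unittest. This is my own sketch, not Gary's Ruby code; describe and DescribeTest are hypothetical names, and the function deliberately does nothing yet.

```python
import unittest


def describe(description):
    # Just enough code to make the first test pass; no behavior yet.
    return None


class DescribeTest(unittest.TestCase):

    def test_describe_accepts_a_description(self):
        # The entire assertion is that this call does not raise.
        describe("an empty spec")


# Run the single test programmatically so the result can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DescribeTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

The test pins down nothing except that the name exists and takes one argument, which is exactly enough to have something to run before writing the next test.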

The other lesson was that the cycle isn’t “write tests, write code, fix code until tests pass.” It’s more like “write a test, write a little code, repeat”. And there’s an additional step that I don’t always remember: re-read your previous tests, and refactor as needed.

It’s interesting watching these screencasts, and feeling like I’m learning, even though I don’t really know the language (Ruby). In this episode I learned a little more about Ruby’s blocks: the interpreter silently ignores a block passed to a function that doesn’t expect it. I wonder why that is?

I also learned that instance_eval is the core of a lot of Ruby DSLs, and runs a block as if it were applied to an instance (I think I have that right). I think the Python equivalent would be to eval some code with an instance’s dict as the local context.
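A rough sketch of that Python idea, with the caveat that it is only an approximation: Ruby's instance_eval rebinds self, while this just exposes the instance's attributes as local names. Greeter and this instance_eval function are hypothetical names for illustration.

```python
class Greeter(object):

    def __init__(self, name):
        self.name = name


def instance_eval(instance, source):
    # Evaluate the expression with the instance's attribute dict as the
    # local namespace; methods and class attributes are NOT visible.
    return eval(source, {}, vars(instance))


greeting = instance_eval(Greeter("Ruby"), "'hello, ' + name")
print(greeting)
```

Because the expression is passed as a string to eval, this should only ever be used with trusted input.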

author:Nathan Yergler
category:destroyed
tags:til
comments:

Hieroglyph 0.6

I just uploaded Hieroglyph 0.6 to PyPI. This release contains a handful of new features, as well as fixes for a few bugs that people encountered. Some highlights:

  • Doug Hellmann contributed support for displaying presenter notes in the console using the note directive.
  • tjadevries contributed a fix for the stylesheet used when printing slides, which should prevent modern browsers from inserting a page break in the middle of a slide.
  • Slide numbering has been reimplemented, and received additional testing.
  • A hieroglyph-quickstart script has been added to make it easier to generate an empty project with hieroglyph enabled.

See the NEWS for the full details.

I’ve also started writing some automated tests for Hieroglyph. These are a little too involved to properly be called “unit tests”, but they’re being run using Travis CI now, which should help avoid regressions as I fix bugs in edge cases.

I spent a few days at OSCON about a week ago, and once again had the pleasure of attending Damian Conway’s “Presentation Aikido”. There are several things he talked about that I could be doing better with my talks. This release of Hieroglyph addresses one of them (quick fade or cut to the next slide, as opposed to the default slide left behavior). I’m working on what other changes I can make to Hieroglyph so that it’s dead simple to just write your slides, and maximize what your attendees take away.

author:Nathan Yergler
category:hieroglyph
tags:rst, hieroglyph, sphinx
comments:

Destroyed 0002

I’m blogging my way through Gary Bernhardt’s excellent Destroy All Software series of screencasts. If you write software, you should probably go buy it today.

Episode 0002 of Destroy All Software talks about nil in Ruby. I’m not a Rubyist. I may be someday, but I’m not today, so I just imagined he was talking about None in Python with weird syntax. This episode is really about how returning a nil value can lead to exceptions that are miles away from where they actually originated. Gary demonstrates this with a little Rails app, and I found myself nodding along: I see this with some frequency in the Eventbrite codebase, where a domain model’s property is set to None, and at a later point other code tries to call a method on that value.

You can, of course, write your own property descriptor (in Python) that checks for None and raises an exception when that value is set. At least then the error is localized to when it’s really being set to (or returning) None. But what you really want is to avoid the error altogether. Gary shows a couple ways to potentially do that, including inverting the relationship between domain models, and introducing a new model instead of just setting a property on an existing one.
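For what it's worth, a minimal sketch of such a descriptor might look like the following. NotNone and Event are hypothetical names, not code from Eventbrite or from the screencast, and I'm assuming a ValueError is an acceptable way to signal the problem.

```python
class NotNone(object):
    """Data descriptor that refuses to store None."""

    def __init__(self, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself, so return the descriptor.
            return self
        try:
            return instance.__dict__[self.name]
        except KeyError:
            raise AttributeError(self.name)

    def __set__(self, instance, value):
        # Fail at the point of assignment, not miles away later on.
        if value is None:
            raise ValueError("%s may not be None" % self.name)
        instance.__dict__[self.name] = value


class Event(object):
    organizer = NotNone("organizer")

    def __init__(self, organizer):
        self.organizer = organizer


event = Event("nathan")
print(event.organizer)
```

With this in place, Event(None) or event.organizer = None raises immediately, so the traceback points at the code that introduced the None.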

author:Nathan Yergler
category:destroyed
tags:til
comments:

Destroyed 0001

Some people blog their way through Knuth or SICP. My attention span is somewhat shorter lately. I’ve recently begun watching Gary Bernhardt’s excellent Destroy All Software screencasts, and I thought it’d be fun to blog my way through it with a series of short posts on what I learned from each episode. I’ve watched a few as I start this, and I think that if you write software and care about writing good software, you should probably go buy DAS now.

I watched Episode 1 on my way to OSCON about a week ago. In it Gary works through building a small bash script to calculate some statistics on a git repository (for example, how many lines of code there were at given points in time). The git plumbing bits were pretty interesting, but it was the actual process that was really educational.

One of the first things he does is map a key to save and then run his script. I almost found myself coveting Vim for a moment, because it seems obvious now that having an immediate feedback loop is actually superior to switching between Emacs and a terminal.

As Gary builds out the script, he points out a few things, like using set -e, and “always quote your arguments”. (It makes a missing argument fall back to an empty string, which programs like grep are perfectly happy with.) That sort of casual, fingertip knowledge is a joy to watch. I guess I haven’t written enough in bash to know better than to check my exit codes manually for things. set -e is obviously better. Way better.

And have you ever considered that bash control structures like while and for have a stdin and stdout? They do. It seemed obvious once I saw him do it, and when I think about the way bash works, it makes sense in a consistency sort of way. But until now I’d never considered piping the output of, say, grep to a control structure.

Watching DAS S1E1 I learned a few things about shell scripting that seem really fundamental, which I wish I’d known about for, well, years. I also realized that I have this weird mix of git knowledge: I understand that it’s a directed acyclic graph and a bunch of the underlying structures. I also am proficient at using magit to manipulate a repository within Emacs. The git porcelain? Not so much.

Finally, I thought it was interesting to see and listen to Gary refactoring a bash script using some of the same principles that I use when looking at Python code. Specifically, wanting to make code easy to read, not just execute.

author:Nathan Yergler
category:destroyed
tags:til
comments:

Draft Work: “Fast Pass”

Fast Pass, copyright 2013 Nathan Yergler

3” x 5” two plate linocut print

Last week when I was printing my line study, I had a couple extra hours in the studio. I’d previously drawn the plates for a Fast Pass double-plate print, so I quickly carved it as a fun distraction. The Fast Pass was the SFMTA monthly pass card when I moved to San Francisco in 2007. It’s since been supplanted by Clipper, but most of my friends have some emotional attachment to the Fast Pass. You had to get a new one every month, and the colors changed each time, often appearing almost seasonal. I have a collection of the Fast Passes that passed through my hands, and it seems like I’m not alone.

As a draft print that took about 90 minutes to carve, I’m happy with the result. I’m particularly happy with how the MUNI logos came out. This is one of the first prints I’ve done with text on it, so I think the next thing I’d like to work on is cleaning up the text carving a bit.

author:Nathan Yergler
category:printmaking
tags:linocut, multiplate, text, postive-negative
comments: