DocSix: Doctests on Python 2 & 3

I was first introduced to doctests working on Zope 3 at early PyCon sprints. At the time the combination of documentation, specification, and test in a single document seemed pretty interesting to me. These days I like to use them for testing my documentation.

Last week stvs2fork helpfully opened a pull request for Rebar, fixing some syntax that’s no longer valid in Python 3. I decided that it’d be interesting to add Python 3.3 to the automated test runs. Fixing the code to work with Python 3 was easy enough, but when I ran the doctests I discovered an issue I hadn’t thought of:

Unicode string output looks different in Python 3 vs Python 2..

>>> validator = AgeValidator()
>>> validator.errors({'age': 'ten'})
{'age': [u'An integer is required.']}

This example works exactly the same in Python 2 and 3: in both cases the error messages are returned as a list of Unicode strings. But in Python 2 the output has the leading u indicator. Not so in Python 3.

What I needed to do is strip the Unicode indicator from the output strings before executing the test; then I’d have the Python 3 doctest I needed. So I wrote a tool that lets me do that.

DocSix lets you run your doctests on Python 2 and 3.

DocSix builds on Manuel, a library for mixing custom test syntax into doctests. DocSix can work with existing uses of Manuel, or it can load your doctests into a unittest TestSuite, ready to go:

from docsix import get_doctest_suite

test_suite = get_doctest_suite(
    'index.rst',
    'advanced.rst',
)

Potentially useful links:

author:Nathan Yergler
category:development
tags:python, doctests, testing, python3
comments:

Revisiting Nested Formsets

It’s been nearly four years since I first wrote about nested formsets. When I wrote about nested formsets, I must have been using Django 1.1 (based on correlating dates in the release notes and the original blog post), which means what I wrote has had four major releases of Django to drift out of date. And yet it’s still one of the most frequently visited posts on my blog, and one of the few that I receive email questions about. Four years later, it seemed like the time to revisit the original post to see if nested formsets still make sense and if so, what they look like now.

Formsets help manage the complexity of maintaining multiple instances of a Form on a single page. For example, if you’re editing a list of items on a single page, each individual item may be a copy of the same form. Formsets help manage things like HTML ID generation, flagging forms for deletion, and validating the entire set of forms together. When used with Models, they allow you to edit the members of a QuerySet all at once.

So what are nested formsets? The example I used previously was something along the lines of Block – Building – Tenant: one Block has many Buildings, and each Building has many Tenants. If you’re editing a Block, you want to see all the Buildings and all the Tenants at once. That’s a fine hypothetical, but one of the questions I get with some frequency is “what’s a good use case for a nested formset?” Four years later — two and a half of them spent doing web development full time — I have yet to encounter a situation where I needed a nested formset. In that time I’ve built some pretty complex forms, including Eventbrite’s event creation flow. That page was complex enough that I built Form Groups to support the interaction, and I think the jury is still out on whether that was a good idea or not. It’s possible that there are use cases for nested formsets in admin-style applications that I haven’t encountered. I think it’s also possible that there are reasons to use a nested formset alongside a Javascript framework to ease the user experience.

Note that if you only have one level of relationships on the page (ie, you’re editing all the Tenants for a single Building in our example) then you don’t need nested formsets: Django’s inline formsets will work just fine.

And why not nested form sets? From the questions people have asked and my experience building Form Groups (which borrowed some ideas), I’ve concluded that they’re difficult to get completely right, have edge cases that can be hard to manage, and create quite complicated user interfaces. In my original blog post I alluded to the fact that I spent most of a three day weekend trying to get the nested formsets to work right. Two thirds of that time was spent on work I eventually threw away, because I couldn’t manage the edge cases. It was only when I started using TDD that I managed to get something working. But I didn’t publish the tests with my previous code example, so no one else was able to benefit from that work.

If you’ve read this far and still think a nested formset is the best solution for your problem, what would that look like with Django 1.5? The answer is: simpler. I decided to rewrite my initial implementation using test driven development. The full implementation of the formset logic only overrides three methods from BaseInlineFormSet.

from django.forms.models import (
    BaseInlineFormSet,
    inlineformset_factory,
)


class BaseNestedFormset(BaseInlineFormSet):

    def add_fields(self, form, index):

        # allow the super class to create the fields as usual
        super(BaseNestedFormset, self).add_fields(form, index)

        form.nested = self.nested_formset_class(
            instance=form.instance,
            data=form.data if self.is_bound else None,
            prefix='%s-%s' % (
                form.prefix,
                self.nested_formset_class.get_default_prefix(),
            ),
        )

    def is_valid(self):

        result = super(BaseNestedFormset, self).is_valid()

        if self.is_bound:
            # look at any nested formsets, as well
            for form in self.forms:
                result = result and form.nested.is_valid()

        return result

    def save(self, commit=True):

        result = super(BaseNestedFormset, self).save(commit=commit)

        for form in self:
            form.nested.save(commit=commit)

        return result

These three method cover the four areas of functionality I called out in the previous post: validation (is_valid), saving (both existing and new objects are handled here by save), and instantiation (creating the nested formset instances, handled by add_fields).

By making it a general purpose baseclass, I’m also able to write a simple factory function, to make using it more in tune with Django’s built-in model formset.

def nested_formset_factory(parent_model, child_model, grandchild_model):

    parent_child = inlineformset_factory(
        parent_model,
        child_model,
        formset=BaseNestedFormset,
    )

    parent_child.nested_formset_class = inlineformset_factory(
        child_model,
        grandchild_model,
    )

    return parent_child

You can find the source to this general purpose implementation on GitHub. I wrote tests at each step as I worked on this, so it may be interesting to go back and look at individual commits, as well.

So how would you use this in with Django 1.5? With a class-based view, of course.

from django.views.generic.edit import UpdateView

class EditBuildingsView(UpdateView):
    model = models.Block

    def get_template_names(self):

        return ['blocks/building_form.html']

    def get_form_class(self):

        return nested_formset_factory(
            models.Block,
            models.Building,
            models.Tenant,
        )

    def get_success_url(self):

        return reverse('blocks-list')

Of course there’s more needed — templates, for one — but this shows just how easy it is to create the views and leverage a generic abstraction. The real keys here are specifying model = models.Block and the definition of get_form_class. Django’s UpdateView knows how to implement the basic form processing idiom (GET, POST, redirect), so all you need to do is tell it which form to use.

You can find a functional, albeit ugly, demo application in the demo directory of the git repository.

So that’s it: a general purpose, updated implementation of nested formsets. I advise using them sparingly :).

author:Nathan Yergler
category:development
tags:django, formsets, forms, python
comments:

Destroyed 0003

I’m blogging my way through Gary Bernhardt’s excellent Destroy All Software series of screencasts. If you write software, you should probably go buy it today.

In Episode 3, Gary builds a simple version of RSpec, using TDD. I’d seen him do something similar at the Testing in Python BOF at PyCon this year, when he trolled the audience with Ruby, challenging the assertion that “RSpec is hard!” with derision and flair.

The interesting part about the screencast, then, was watching him drive his coding with tests. I’d describe myself as a “testing believer”, but I think I get tripped up at the same place a lot of people do: where do you begin? How do you know what test to write first, when you don’t even know what the call interface is going to look like?

So I found myself exclaiming as he began: “that test doesn’t do shit!” Indeed, the first test doesn’t do anything other that test that there’s this describe thing, that happens to take an argument. So the primary lesson I took away from Episode 3 was that when it comes to TDD, you’d don’t have to know where you’d end up. You just need to start.

The other lesson was that the cycle isn’t “write tests, write code, fix code until tests pass.” It’s more like “write a test, write a little code, repeat”. And there’s an additional step that I don’t always remember: re-read your previous tests, and refactor as needed.

It’s interesting watching these screencasts, and feeling like I’m learning, even though I don’t really know the language (Ruby). In this episode I learned a little more about Ruby’s blocks: the interpreter silently ignores a block passed to a function that doesn’t expect it. I wonder why that is?

I also learned that instance_eval is the core of a lot of Ruby DSLs, and runs a block as if it were applied to an instance (I think I have that right). I think the Python equivalent would be to eval some code with an instance’s dict as the local context.

author:Nathan Yergler
category:destroyed
tags:til
comments:

Hieroglyph 0.6

I just uploaded Hieroglyph 0.6 to PyPI. This release contains a handful of new features, as well as fixes for a few bugs that people encountered. Some highlights:

  • Doug Hellmann contributed support for displaying presenter notes in the console using the note directive.
  • tjadevries contributed a fix for the stylesheet used when printing slides, which should prevent modern browsers from inserting a page break in the middle of a slide.
  • Slide numbering has been reimplemented, and received additional testing.
  • A hieroglyph-quickstart script has been added to make it easier to generate an empty project with hieroglyph enabled.

See the NEWS for the full details.

I’ve also started writing some automated tests for Hieroglyph. These are a little too involved to properly be called “unit tests”, but they’re being run using Travis CI now, which should help avoid regressions as I fix bugs in edge cases.

I spent a few days at OSCON about a week ago, and once again had the pleasure of attending Damian Conway’s “Presentation Aikido”. There are several things he talked about that I could be doing better with my talks. This release of Hieroglyph addresses one of them (quick fade or cut to the next slide, as opposed to the default slide left behavior). I’m working on what other changes I can make to Hieroglyph so that it’s dead simple to just write your slides, and maximize what your attendees take away.

author:Nathan Yergler
category:hieroglyph
tags:rst, hieroglyph, sphinx
comments:

Destroyed 0002

I’m blogging my way through Gary Bernhardt’s excellent Destroy All Software series of screencasts. If you write software, you should probably go buy it today.

Episode 0002 of Destroy All Software talks about nil in Ruby. I’m not a Rubyist. I may be someday, but I’m not today, so I just imagined he was talking about None in Python with weird syntax. This episode is really how returning a nil value can lead to exceptions that are miles away from where they actually originated. Gary demonstrates this with a little Rails app, and I found myself nodding along: I see this with some frequency in the Eventbrite codebase, where a domain model’s property is set to None, and at a later point other code tries to call a method on that value.

You can, of course, write your own property descriptor (in Python) that checks for None and raises an exception when that value is set. At least then the error is localized to when it’s really being set to (or returning) None. But what you really want is to avoid the error altogether. Gary shows a couple ways to potentially do that, including inverting the relationship between domain models, and introducing a new model instead of just setting a property on an existing one.

author:Nathan Yergler
category:destroyed
tags:til
comments:

Destroyed 0001

Some people blog their way through Knuth or SICP. My attention span is somewhat shorter lately. I’ve recently begun watching Gary Bernhardt’s excellent Destroy All Software screencasts, and I thought it’d be fun to blog my way through it with a series of short posts on what I learned from each episode. I’ve watched a few as I start this, and I think that if you write software and care about writing good software, you should probably go buy DAS now.

I watched Episode 1 on my way to OSCON about a week ago. In it Gary works through building a small bash script to calculate some statistics on a git repository (for example, how many lines of code there were at given points in time). The git plumbing bits were pretty interesting, but it was the actual process that was really educational.

One of the first things he also does is map a key to save and then run his script. I almost found myself coveting Vim for a moment, because it seems obvious now that having an immediate feedback loop is actually superior to switching between Emacs and a terminal.

As Gary builds out the script, he points out a few things, like using set -e, and “always quote your arguments”. (It makes a missing argument fallback to an empty string, which programs like grep are perfectly happy with.) That sort of casual, fingertip knowledge is a joy to watch. I guess I haven’t written enough in bash to know better than to check my exit codes manually for things. set -e is obviously better. Way better.

And have you ever considered that bash control structures like while and for have a stdin and stdout? They do. It seemed obvious once I saw him do it, and when I think about the way bash works, it makes sense in a consistency sort of way. But until now I’d never considered piping the output of, say, grep to a control structure.

Watching DAS S1E1 I learned a few things about shell scripting that seem really fundamental, which I wish I’d have known about for, well, years. I also realized that I have this weird mix of git knowledge: I understand that it’s a directed acyclic graph and a bunch of the underlying structures. I also am proficient at using magit to manipulate a repository within Emacs. The git porcelain? Not so much.

Finally, I thought it was interesting to see and listen to Gary refactoring a bash script using some of the same principles that I use when looking at Python code. Specifically, wanting to make code easy to read, not just execute.

author:Nathan Yergler
category:destroyed
tags:til
comments:

Draft Work: “Fast Pass”

Fast Pass, copyright 2013 Nathan Yergler

3” x 5” two plate linocut print

Last week when I was printing my line study, I had a couple extra hours in the studio. I’d previously drawn the plates for a Fast Pass double-plate print, so I quickly carved it as a fun distraction. The Fast Pass was the SFMTA monthly pass card when I moved to San Francisco in 2007. It’s since been supplanted by Clipper, but most of my friends have some emotional attachment to the Fast Pass. You had to get a new one every month, and the colors changed each time, often appearing almost seasonable. I have a collection of the Fast Passes that passed through my hands, and it seems like I’m not alone.

As a draft print that took about 90 minutes to carve, I’m happy with the result. I’m particularly happy with how the MUNI logos came out. This is one of the first prints I’ve done with text on it, so I think the next thing I’d like to work on is cleaning up the text carving a bit.

author:Nathan Yergler
category:printmaking
tags:linocut, multiplate, text, postive-negative
comments:

Emacs & Jedi

Brandon Rhodes delivered the keynote at PyOhio yesterday[1]. He talked about sine qua nons: features that without which, a language is nothing to him. One of the things he mentioned was Jedi, a framework for extending editors with Python autocompletion, documentation lookup, and source navigation. I say “editors”, vaguely, because Jedi consists of a Python server that the editor communicates with via an editor specific plugin. I’d seen Jedi before, but hadn’t managed to get it working with Emacs. After hearing Brandon speak of it so glowingly, I decided to give it another try. The actual installation was easy: using the master branch of el-get, the recipe installed the Jedi Emacs plugin and its dependencies seamlessly. And it seemed to just work for the standard library.

And once I’d enabled it for python-mode, I was indeed able to autocomplete things from the standard library, and jump to the source of members implemented in Python. But I found that I wasn’t able to navigate to the third party dependencies in my project, and eventually figured out there were three cases I needed to address.

  • Many things I work on use virtualenv. Jedi supports virtual environments if the VIRTUAL_ENV environment variable is set, but I tend to keep Emacs running and switch between several different projects, each with their own environment.
  • Some of my projects also use buildout. When I’m using buildout for a project, the dependencies end up in an eggs sub-directory, which Jedi (as far as I know) doesn’t actually know about.
  • Finally, the setup we use at Eventbrite requires some special handling, as well: we store our source checkouts on an encrypted disk image, which is then mounted into a Vagrant virtual machine, where the actual virtual environment lives. Since the virtual environment isn’t on the same “machine” that we’re editing on, I need to tell Jedi explicitly what directories to find source in.

The Jedi documentation includes an advanced example of customizing the server arguments on per buffer. It assumes static arguments, but it seemed like a solution was possible. I spent a couple hours this afternoon working on my Emacs Lisp skills to make Jedi work in all three of my cases.

kenobi.el

kenobi.el (gist) is the result, and it does a few things:

  1. Fires hooks for each mode after the file or directory local variables have been set up. I found a StackOverflow post that confirmed what I had observed: any file or directory local variables weren’t set when the normal python-mode hook was fired. The mode-specific hooks are chained off of hack-local-variables-hook, which is fired after the local variables have been resolved.
  2. Walks up the directory hierarchy from the buffer file, looking for ./bin/activate at each level. If it finds one, it assumes this is the virtual env, and adds it to the list of virtual envs Jedi will look at.
  3. Walks up the directory hierarchy looking for an ./eggs/*.egg sub-directory at each level. If it finds one, it adds each of those .egg subdirectories to the sys.path Jedi will look at. This allows Jedi to work when you’re editing files in buildout-based projects.
  4. Looks to see if the aditional_paths variable has been set as a list of other paths to add.

The first three bits are sort of implementation details: you can usually just ignore them, and Jedi will just work. The final, though, needs a little explanation.

As I mentioned above, Eventbrite stores the source code on a disk image, which is mounted into a virtual machine where the actual virtualenv lives. That means I need to add specific paths to sys.path when I open a source file in that disk image. To get that to work, I create a .dir-locals.el in the root of the source tree, something like:

(
 (nil . ((additional_paths . (
          "/Volumes/eb_home/work/src"
          "/Volumes/eb_home/work/deps"
          ))))
)

I’m sure that my Emacs Lisp could be improved upon, but it felt pretty good to figure out enough to integration Jedi into the way I use Emacs. I haven’t worked with Jedi extensively, but so far it seems to work pretty well. The autocomplete features seem to be minimally invasive, and the show docstring and jump to definition both work great.

[1]The fact that the video is up less than 36 hours after the keynote is a testament to how great Next Day Video is. They do amazing work at Python (and other) conferences and make it possible to enjoy the hallway track without worrying about missing a presentation.
author:Nathan Yergler
category:emacs
tags:python, virtualenv, buildout
comments:

Effective Django at OSCON Post-Mortem

I’ve been on the road a little more than a week now, back to back conferences. On Tuesday I presented my Effective Django tutorial at OSCON. I’ve recently updated it to cover integrating static assets with your project, and re-organized some of the view information. The biggest challenge with presenting a tutorial like that is figuring out how fast (or slow) to go. I’ve practiced the tutorial and use Django daily, so of course I’m able to type and diagnose what’s going on more quickly. At the break on Tuesday a few people asked me to slow down, but as I walked around the room, two others told me (quite kindly) that they wouldn’t mind if I sped up. This was a little worse than at PyCon, I think, primarily because I didn’t have friends and colleagues assisting me. At PyCon they were able to wander the room and help backfill support for those who were moving a little slower.

I think the next time I deliver Effective Django — or any tutorial — I’ll probably try to mitigate with a few different ways:

  • Make sure I have assistants. I actually had one planned for OSCON, but he unfortunately fell ill. Next time I’ll try to have more than one scheduled (I had four at PyCon, three is probably fine).
  • Provide more guidance in the synopsis. If I’m going to assume no prior Django knowledge, I should probably say something like “We’ll start from zero and build a Django application together,” rather than just relying on the slightly ambiguous Novice tag in the program.
  • Provide some clear stopping points. I noticed that some attendees stayed with it for the first half, or first three quarters, but disengaged before the end. I tried to structure the tutorial so that you can stop at any time and still have learned something, but I could be more explicit about that. “And now we know how to write views that use the ORM,” rather than “We’ll see in a moment how to wire that into the …”
  • Make sure my handouts are ready to go. At OSCON I didn’t realize they were sitting in cardboard boxes off to the side until half way through. I noticed that in the second half, a couple of the people who had been struggling previously were keeping up slightly better once they had another source to refer to.

The people I spoke to afterward uniformly said they learned something (whether it was as much as they’d hoped is another question, I suppose), so this feels like an overall success.

author:Nathan Yergler
category:talks
tags:oscon, pyohio, effective django, python3
comments:

Coffee Cup line study

Untitled (coffee cup line study), copyright 2013 Nathan Yergler

4” x 5” linocut print

I wanted to practice using lines to describe, rather than outline, shapes and surfaces, so I took a picture of a coffee cup on a sunny day and decided to try and make a print from it. I worked small and (relatively) simple to avoid investing too much time in what is (effectively) a practice piece.

It was pretty instructive to carve this in an afternon, and then print it on Wednesday. Because it was small project, I was able to remember what I expected when I was carving, and compare that to what came out. There were a few things that came out as expected, and a few that didn’t. That was sort of the point.

author:Nathan Yergler
category:printmaking
tags:linocut, study, practice
comments: