New Foundations

I’ve written a little about how I think about technical debt, and what it means to live with it. I want to talk about some technical debt at Creative Commons, and how we handled it the wrong way. A project we thought would take a couple months stretched into years, and in the end never fulfilled the promise we thought it had. And it was supposed to be a straightforward project.

One of the things people don’t always know about Creative Commons is that there was a large technical component undergirding the licenses. Every license was prepared for three audiences (in talks, this is where I call them disjoint, in a lame attempt at humor): humans (the license “deed”), lawyers (the legal text), and machines. The output for machines was an RDF model of the license: its permissions, requirements, and prohibitions. In 2008 we had a technical all-hands meeting where the tech team came to the San Francisco office for a week. At that time porting (preparing a license for a new legal jurisdiction and translating the web tools) was in full swing, and as we talked about what the pain points were, launching these new jurisdictions came up as a major source of pain. As we started drawing the model of how things worked on the site, I arrived at the following diagram.

We had at least three different “products” — the license chooser, the API for 3rd parties, and the prepared licenses (deeds and RDF). And for hysterical and historical reasons, they didn’t really use the same information. Well, they did at a certain level: they all used the same translation files, but after that all bets were off. We had the “questions” used for selecting a license modeled as an XSLT transformation (why? don’t remember; wish I knew what we were thinking when we did that), but the transformation needed to have localized content, so we would generate a new XSLT document from a ZPT template (yes, really) when we updated the translations. The license RDF was stored as static files for performance, but there was increasing pressure to provide localized data there, too, which was going to cause a world of hurt. And the chooser had a thin wrapper, cc.license, around the XSLT. Except when it went directly to the XSLT for special cases.

If you look in the upper right-hand corner, you’ll see something labeled “cc.licenze”. This was a prototype library I had written when adding support for CC0 to the site. The idea was this: We claim that the RDF is the technical model for our legal tools. If that’s true, can we put enough information in it to drive the entire process, and have a single source of information? After launching CC0, signs pointed to yes. We set out to build a glorious future.

We’d build a single wrapper around our RDF and use it everywhere. We’d update one thing when we launched a new jurisdiction, and all the changes would flow to all parts of the site. It sounded amazing. The thing is, we were talking about moving our core infrastructure — our house — to a new foundation, but that foundation wasn’t built yet. We hadn’t really even figured out if it’d support the house or not.

Undeterred, an engineer set out to start building out “cc.licenze”, filling in the gaps I’d left to make it do all the things that licenses need that CC0 does not. And he got most of the work done, and then he left. So the work languished while we focused on continuing to ship new jurisdictions and do everything an understaffed technical team has to deal with.

The problem isn’t that we wanted to improve our underlying infrastructure, or that we wanted a coherent and consistent model. Those are the right goals. The problem was trying to build an entirely new foundation, with similar but not exactly the same APIs as the original one, and thinking we were going to slip it in. Starting this project today, I’d look at the three ways we were doing things, find the one that had the least debt, and rebase the other services/products onto it. By choosing one currently in use, any improvements made (either by rebasing or fixing bugs) would show immediate benefit. There’s immediate, tangible benefit to going from three ways to do something to two, and from two to one. Once everything uses the same foundation, there’s only one thing to rebuild and replace, not three, and we probably have a better idea about everything it needs to do.

To successfully live with technical debt, this is the sort of maneuver you often have to use. I think of this as Lateral Refactoring: you’re not refactoring to the API/design you want to wind up with, you’re tacking along an orthogonal axis until you’re at the point where you can start moving forward again. By doing this you can realize some benefit sooner, and continue shipping new features and bug fixes.
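
To make that a little more concrete, here is a minimal sketch of what rebasing onto an existing foundation might look like. Everything in it is hypothetical: `ExistingWrapper`, `LegacyApi`, `DeedRenderer`, and the sample titles are illustrative names I made up for this example, not the real cc.license or cc.licenze APIs. The point is only that each product keeps its public interface while its private data source is swapped for the one implementation already running in production.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    code: str
    jurisdiction: str
    title: str


class ExistingWrapper:
    """Hypothetical stand-in for the least-indebted implementation in use."""

    def __init__(self):
        self._titles = {}  # {(code, jurisdiction): title}

    def add(self, code, jurisdiction, title):
        self._titles[(code, jurisdiction)] = title

    def lookup(self, code, jurisdiction):
        return License(code, jurisdiction, self._titles[(code, jurisdiction)])


class LegacyApi:
    """A second product, rebased: same method signature, new data source."""

    def __init__(self, wrapper):
        self._wrapper = wrapper

    def license_title(self, code, jurisdiction="unported"):
        return self._wrapper.lookup(code, jurisdiction).title


class DeedRenderer:
    """A third product, rebased onto the same wrapper."""

    def __init__(self, wrapper):
        self._wrapper = wrapper

    def render(self, code, jurisdiction="unported"):
        lic = self._wrapper.lookup(code, jurisdiction)
        return f"{lic.title} ({lic.code}, {lic.jurisdiction})"


# Illustrative data only; the titles are sample strings.
wrapper = ExistingWrapper()
wrapper.add("by", "unported", "Attribution")

api = LegacyApi(wrapper)
deeds = DeedRenderer(wrapper)

# Launching a new jurisdiction is now one update that every product sees.
wrapper.add("by", "xx", "Attribution (sample jurisdiction)")

print(api.license_title("by", "xx"))
print(deeds.render("by", "xx"))
```

Neither `LegacyApi` nor `DeedRenderer` has reached the design we ultimately wanted, but going from three data sources to one pays off right away, and whatever replaces `ExistingWrapper` later only has to be swapped in at a single place.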

date:2012-05-16 23:13:57
category:engineering, process, talks

Living With It

So now that I’ve talked about what I think of when I say “technical debt”, I want to dig in on the other half of the title, “Living With It”. What does it mean to live with technical debt? I want to be clear: it does not mean simply accepting or ignoring it. I’m certain that’s the wrong way to build long-lived, robust software. When we encounter technical debt, or something that feels hard, I think there are a few common, understandable, and dangerous reactions. These roughly fall into the categories of “I can do better”, “One more won’t hurt”, and “I can’t go on.”

When some engineers — even good (but not great) ones — encounter technical debt, their reaction may be “I can do better”. That is, “Oh, this is terrible, I can’t possibly work with code like this, I’ll rewrite this part of the system, and then I can get around to what I came here to do.” Rewriting or refactoring debt away may be the right decision, but this statement contains unspoken assumptions that better code is more important than new features or bug fixes for users. This is where the paradox of living with technical debt first shows itself: living with technical debt does not mean accepting it, but it also doesn’t mean fixing it. Right now. The business, the organization, has to make decisions about what’s most important. (Engineers need input into those decisions, and the business needs to respect that input, or the best engineers will go elsewhere, where their input will be respected.) It’s up to the business to decide “can we go dark for n days/weeks/months.” Sometimes the answer may be yes, and we’re free to improve the code with abandon. I think that’s a rare situation. More often the answer is “no”, so we need to live with the debt and develop strategies for improving it (more on that later).

Another reaction that I think is all too common is “I guess one more won’t hurt”. That is, “Well, we’re stupid in these five places, what’s one more?” Living with technical debt does not mean you continue to incur it. If anything, it’s essential to stop running up the tab. This requires rigor and strength of will, not just on the part of the engineer working on the code, but on the part of her peers. The team needs to decide that incurring additional debt is not acceptable: you can maintain or you can improve, but you can’t backslide. The danger of “one more won’t hurt” is that the problem spreads: you build new features that repeat past mistakes, instead of providing a model for future work.

Finally, sometimes we look at code and think, “I can’t go on”. I find that those are the times it’s helpful to step away from a project, take a break, and come back after a good night’s sleep. You don’t always have that luxury, but feelings of despair rarely coincide with my best work. I’ve observed that indulging in the first two ways of thinking, “I can do better” and “One more won’t hurt”, often leads to the final one: “I can’t go on”. “One more won’t hurt” just digs a deeper and deeper hole, until you can’t see your way out, and “I can do better” often leaves you with a piece of “perfect” code that doesn’t quite fit into the rest of the system, leaving you to shims and scotch tape, the very things you started out trying to avoid.

In “Good to Great”, Jim Collins writes about characteristics that separate good companies from great ones. One of the principles he identifies is “Confront the brutal facts, but never lose faith.” In other words, it does no good to pretend that your code (company in his case) is something that it isn’t. Collins talks about meeting Admiral Stockdale, and asking him, “Who didn’t make it out?” “Oh, that’s easy — the optimists.” Stockdale explains that the optimists were routinely disappointed, and eventually lost faith. “I can’t go on.” Collins quotes Stockdale as saying, “You must never confuse faith that you will prevail in the end — which you can never afford to lose — with the discipline to confront the most brutal facts of your current reality, whatever they might be.” Technical debt may be a far cry from Stockdale’s situation, but the principle holds as the heart of truly living with technical debt: we must confront things as they are, not as we wish they were. And we must believe that we can make things better, that we know where we’re going.

date:2012-05-15 21:32:00
category:engineering, process, talks

Living With Technical Debt, Part One

I’m speaking at Velocity next month on “Living with Technical Debt”. Like any mature codebase, our software at Eventbrite has technical debt. Like any project with rapidly shifting priorities, the code we built at Creative Commons had technical debt. It’s only in the last year or so that I’ve really come to see that and start to think about how one navigates technical debt. So there are a lot of ideas floating around in my head about what I want to talk about. This post (probably the first of several) is me trying to get those ideas out of my head and into text, so I can go about organizing my talk. Not everything in here is going to make it into the final talk, and I expect that whatever does will be re-organized and re-synthesized.

I don’t think it’s unreasonable to start with what I mean by “technical debt”. “Technical Debt” is a euphemism, usually trotted out when we’re talking about something we don’t like about software or systems. I say “don’t like” as if the label is undeserved: it’s not always clear when someone says “technical debt” if they’re talking about code that’s obviously difficult to work with, or code that just makes a different set of choices than they would have made. One definition I’ve been thinking about is this: technical debt is some aspect of your system that increases the cognitive overhead of understanding, improving, and maintaining it. It’s possible there should be a clause added about “for the majority of developers”, too: I know there’s code I’ve written that absolutely minimizes cognitive overhead for me, but the things I’m used to, idiomatic Nathan, make it harder for someone else to come and fix a bug or add functionality.

By speaking about technical debt in terms of cognitive overhead, we can start to detach ourselves from the situation emotionally. It’s pretty easy to become emotionally involved with the code we write. And usually that’s a good thing: it’s important for me to work on things that feel important, things that I feel like I can leave my mark on. I’d like to posit, however, that it’s possible to become emotionally co-dependent with your code. That may sound like a strange idea, so let me explain: whenever something I create becomes a proxy for my self (my individuality, my self-worth), it is nearly impossible for me to see problems with it. It is nearly impossible for me to hear anything but glowing praise. And when I do hear glowing praise, it’s never enough. I’ve observed two different effects of these feelings. First, I start treating situations like a zero-sum game: it’s not enough to succeed, others must fail. It’s not hard to see how this would lead to hypersensitivity and hypercriticality at the same time. Second, I don’t make smart decisions: I make them based on my feelings rather than on reality. I don’t know why this would be any less true of code than it is of other endeavors. So to really see technical debt in our systems, we need to detach ourselves emotionally: it’s not about who’s at fault, it’s about how we make it better.

(There’s a whole other topic around team building here; I’m going to assume for the purposes of this discussion that you have the people you want on your team, either because they’re operating at the level you need them to, or because you believe they can grow to that level.)

So what are some ways your system can add to the cognitive overhead needed to understand it? I can think of a few: inconsistency, duplication, and lack of cohesion all immediately come to mind. These all make it difficult for an engineer to understand, maintain, and improve a system. More on that later.

date:2012-05-14 21:29:05
category:engineering, process, talks

Debugging my Creative Process

I’ve been taking print making classes this year, and have really enjoyed exploring something new. What’s been particularly interesting for me is seeing parallels between what I think of as a creative hobby – print making – and what I think of as creative work – writing software.

I showed my work publicly for the first time two weeks ago. The day after the show I had booked time in the studio. I showed up after work that day with my tools, anxious to get back to printing. It had been a couple weeks since I’d been in the studio, and last time I was there had been very productive: I’d spent the entire day working with the same image, producing six unique prints as I tried to add more texture and depth to the precise lines of the stencils I’d been creating. The result was a set of prints which were somewhat uneven in quality, but which showed a progression of control and vision. With each one I tried something a little different, until I felt like I had a good understanding of what I really wanted. Going into the studio that evening, I had about three hours of printing time, and hoped to bring that same exploration to another image.

I did wind up with five prints that evening, but none of them resonated for me like the Golden Gate Bridge prints did. As I pulled each print, I’d look at it, realize it wasn’t what I’d had in mind, and try to think about what to do next. Time in the studio usually passes quickly, and I feel like I’m racing the clock to do everything that comes to mind. But that evening felt disjointed and choppy, and when it came time to clean up, I was ready to go home. I’d tried serious, whimsical, and abstract, and none of them felt like they worked for me that night. As I rode home from the studio, I felt disappointment. The experience wasn’t the effortless expression of creativity I was used to, and the work I had produced didn’t speak to me as I hoped and had come to expect.

The next morning I looked over the pieces again, and I realized that in each case there were one or two things that I didn’t like, which overwhelmed the rest of the piece. In one case I made a choice about negative space that turned out to be the wrong one. In another I tried to do too much at once, and my vision hadn’t translated well onto paper. As I stood there looking at each piece, I thought to myself, “Why didn’t you just do this exact same image again, but change the aspect you didn’t like?” Somehow I’d forgotten that it was OK to repeat myself, to try again if the result wasn’t what I was looking for. I’d fallen into the trap of thinking that creativity is all about the flash, the spark, and that it just magically happens.

If I think about writing software, I’m well aware that getting the result I want is real work: we have test suites, debuggers, and continuous integration tools for a reason. We often don’t get it right on the first try. Just because the “test suite” for print making is personal and subjective doesn’t make iteration any less important.

I had my first linocut class Wednesday evening. Linocut involves carving a linoleum plate with an image, which you can then use to make a print. Our instructor asked us to bring a simple image to use for our first plate, and to get some experience with carving. I spent some time searching for the “perfect” image to use, something that would be new and different and push the boundaries of my print making. In the end I wound up taking one of my monotype stencils and generating a scaled down version of it. And I couldn’t be happier with how it turned out.


Yes, it’s the same cat that I’ve been working with for the past couple months. But that doesn’t mean I’m not expanding my skill set, trying different techniques, and iterating. I have plenty of time to try new images out, and if I spend the time now, debugging my technique and learning how to iterate (just like I do with software), I think my ability to tackle more complex and involved work will grow, just like it has with software.

date:2011-04-24 10:22:00
category:printing, process
tags:iteration, linocut, meta