Thoughts on Deploying and Maintaining SMW Applications

In September or October of last year, I received an email from someone who had come across CC Teamspace and was wondering if there was a demo site available they could use to evaluate it. I told them, “No, but I can probably throw one up for you.” A month later I had to email them and say, “Sorry, but I haven’t found the time to do this, and I don’t see that changing.” This is clearly not the message you want to send to possible adopters of your software — “Sorry, even I can’t install it quickly.” Now part of the issue was my own meta/perfectionism: I wanted to figure out a DVCS driven upgrade and maintenance mechanism at the same time. But even when I faced the fact that I didn’t really need to solve both problems at the same time, I quickly became frustrated by the installation process. The XML file I needed to import seemed to contain extraneous pages, and things seemed to have changed between MediaWiki and/or extension versions since the export was created. I kept staring at cryptic errors, struggling to figure out if I had all the dependencies installed. This is not just a documentation problem.

If we think about the application life cycle, there are a three stages a solution to this problem needs to address:[†]_

  1. Installation
  2. Customization
  3. Upgrade

If an extension is created using PHP, users can do all three (and make life considerably easier if they’re a little VCS savvy). But if we’re dealing with an “application” built using Semantic MediaWiki and other SMW Extensions, it’s possible that there’s no PHP at all. If the application lives purely in the wiki, we’re left with XML export/import[‡]_ as the deployment mechanism. With this we get a frustrating release process, Customization support, and a sub-par Installation experience.

The basic problem is that we currently have two deployment mechanisms: full-fledged PHP extensions, and XML dumps. If you’re not writing PHP, you’re stuck with XML export-import, and that’s just not good enough.

A bit of history: When Steren created the initial release of CC Teamspace, he did so by exporting the pages and hand tweaking the XML. This is not a straight-forward, deterministic process that we want to go through every time a bug fix release is needed.

For users of the application, once the import (Installation) is complete (assuming it goes better than my experience), Customization is fairly straight-forward: you edit the pages. When an Upgrade comes along, though, you’re in something of a fix: how do you re-import the pages, retaining the changes you may have made? Until MediaWiki is backed by a DVCS with great merge handling, this is a question we’ll have to answer.

We brainstormed about these issues at the same time we were thinking about Actions. Our initial thoughts were about making the release and installation process easier: how does a developer[◊]_ indicate these pages in my wiki make up my application, and here’s some metadata about it to make life easier.

We brainstormed a solution with the following features:

  1. An “Application“ namespace: just as Forms, Filters, and Templates have their own namespace, an Application namespace would be used to define groups of pages that work together.
  2. Individual Application Pages, each one defining an Application in terms of Components. In our early thinking, a Component could be a Form, a Template, a Filter, or a Category; in the latter case, only the SMW-related aspects of the Category would be included in the Application (ie, not any pages in the Category, on the assumption that they contain instance-specific data).
  3. Application Metadata, such as the version[♦]_, creator, license, etc.

A nice side effect of using a wiki page to collect this information is that we now have a URL we can refer to for Installation. The idea was that a Special page (ie, Special:Install, or Special:Applications) would allow the user to enter the URL of an Application to install. Magical hand waving would happen, the extension dependencies would be checked, and the necessary pages would be installed.

While we didn’t get too far with fleshing out the Upgrade scenario, I think that a good first step would be to simply show the edit diff if the page has changed since it was Installed, and let the user sort it out. It’s not perfect, but it’d be a start.

I’m not sure if this is exactly the right approach to take for packaging these applications. It does effectively invent a new packaging format, which I’m somewhat wary of. At the same time, I like that it seems to utilize the same technologies in use for building these applications; there’s a certain symmetry that seems reassuring. Maybe there are other, obvious solutions I haven’t thought of. If that’s the case, I hope to find them before I clear enough time from the schedule to start hacking on this idea.

date:2010-01-25 21:24:51
category:cc, development
tags:cc, mediawiki, semantic mediawiki, smw

“Actions” for SMW Applications (Hypothetically)

Talking about AcaWiki has me thinking some more about our experiences over the past couple years with Semantic MediaWiki, particularly about building “applications” with it. I suppose that something like AcaWiki could be considered an application of sorts — I certainly wrote about it as such earlier this week — but in this case I’m talking about applications as reusable, customizable pieces of software that do a little more than just CRUD data.

In 2008 we were using our internal wiki, Teamspace, for a variety of things: employee handbook, job descriptions, staff contact information, and grants. We decided we wanted to do a better job at tracking these grants, specifically the concrete tasks associated with each, things we had committed to do (and which potentially required some reporting). As we iterated on the design of a grant, task, and contact tracking system, we realized that a grant was basically another name for a project, and the Teamspace project tracking system was born.

As we began working with the system, it became obvious we needed to improve the user experience; requiring staff members to look at yet another place for information just wasn’t working. So Steren Giannini, one of our amazing interns, built Semantic Tasks.

Semantic Tasks is a MediaWiki extension, but it’s driven by semantic annotations on task pages. Semantic Tasks’ primary function is sending email reminders. One of the things I really like about Steren’s design is that it works with existing MediaWiki conventions: we annotate Tasks with the assigned (or cc’d) User page, and Semantic Tasks gets the email addresses from the User page.

There were two things we brainstormed but never developed in 2008. I think they’re both still areas of weakness that could be filled to make SMW even more useful as an application platform. The first is something we called Semantic Actions: actions you could take on a page that would change the information stored there.

Consider, for example, marking a task as completed. There are two things you’d like to do to “complete” a task: set the status to complete and record the date it was completed. The thought was that it’d be very convenient to have “close” available as a page action, one which would effect both changes at once without requiring the user to manually edit the page. Our curry-fueled brainstorm was that you could describe these changes using Semantic Mediawiki annotations[1]_. Turtles all the way down, so to speak.

The amount of explaining this idea takes, along with some distance, makes me uncertain that it’s the right approach. I do think that being able to easily write extensions that implement something more than CRUD is important to the story of SMW as a “real” application platform. One thing I that makes me uncertain about this approach is the fear that we are effectively rebuilding Zope 2’s ZClasses, only crappier. ZClasses, for those unfamiliar, were a way to create classes and views through the a web-based interface. A user with administrative rights could author an “application” through the web, getting lots of functionality for “free”. The problem was that once you exhausted ZClasses’ capabilities, you pretty much had to start from scratch when you switched to on disk development. Hence Zope 2’s notorious “Z-shaped learning curve”. I think it’s clear to me now that building actions through the web is going to by necessity expose a limited feature set. The question is whether it’s enough, or if we should encourage people to write [Semantic] Mediawiki extensions that implement the features they need.

Maybe the right approach is simply providing really excellent documentation so that developers can easily retrieve the values the SMW annotations on the pages they care about. You can imagine a skin that exists as a minor patch to Monobook or Vector, which uses a hook to retrieve the installed SMW “actions” for a page and displays them in a consistent manner.

Regardless of the approach taken, if SMW is going to be a platform, there has to be an extensibility story. That story already exists in some form; just look at the extensions already available. Whether the existing story is sufficient is something I’m interested in looking at further.

Next time: Thoughts on Installation and Deployment.

[1]The difference between Semantic Tasks and our hypothetical Semantic Actions is that the latter was concerned solely with making some change to the relevant wiki page.
date:2010-01-07 22:13:41
category:cc, development
tags:cc, mediawiki, semantic mediawiki, smw, teamspace

AcaWiki: On Building Emerging Applications

I’m woefully late in noting the launch of AcaWiki. Mike does a good job exploring the sweet spot AcaWiki may fill between research blogging and open access journals, and where AcaWiki fits into the wiki landscape. AcaWiki is interesting to me for two reasons; first, I was the technical lead on the project, and second, it’s another recent example of building a site using MediaWiki as a platform. More specifically, we used MediaWiki along with Semantic MediaWiki, Semantic Forms, and several other related extensions as the platform for the site.

The idea of using a wiki for a community oriented site is far from new. The difference here is that Neeru came to us talking about specific ways people could interact with the site — specific structured data she wanted to organize and capture about academic articles. For anyone familiar with MediaWiki and Wikipedia, the obvious answer istemplates; Wikipedia uses them extensively to provide a consistent presentation for parts of an articles (messages about the article, citations, etc). The catch is that for someone coming to a site for the first time, who perhaps has not edited a wiki previously, templates are a bit of inside baseball — you need to know which one to use, and you need to know how to format them in your article. Of course these are trainable skills, but I suspect for many users they’re non-obvious. Semantic Forms lets us provide a form for entering these fields, which is then translated to a template.

The question that comes up when discussing this approach with non-wiki-philes is, “why use a wiki at all? if all you need are CRUD forms, why not just whip it up in Rails, Django, etc?” The question is a good one — a specialized tool almost always has the potential to look fantastic compared to an off the shelf one. And who wants to learn that weird markup syntax, anyway? The thing is, at the end of the day, AcaWiki isn’t a software project, it’s a community project. There isn’t a team of engineers available to help move the toolset forward. There isn’t staff available to fix bugs and write migration scripts. So using off the shelf tools with active communities is essential to achieving any amount of scalability.

As Mike points out, there are some niches AcaWiki seems primed to fill. While working on the site, however, it was clear there are lots of unanswered questions about how that will actually happen. AcaWiki, like many sites that seek to serve a community of interest in a given area, is an emerging application. The data schema isn’t well defined, and we don’t necessarily know how users are going to interact with the site. The goal is to get something that users can use in place; something that provides just enough structure to encourage newcomers, while retaining the plasticity and flexibility needed to grow and evolve.

As I mentioned before, this is not the first “application” we’ve built using this tool chain; we use MediaWiki and Semantic MediaWiki at Creative Commons in many places. We use it to track Events our community puts together, and we use it to track things we’d like developers to work on (NB: the latter is woefully out of date and stagnated; perhaps a negative use case for this sort of tool). We even built a system for tracking grants and projects using it.

Using MediaWiki and Semantic MediaWiki as an application platform isn’t appropriate for every project and it isn’t a cure all; there are real limitations, like any off the shelf system. In some cases these issues are magnified due to the fact that it’s not explicitly designed as a platform. For applications that rely on community involvement and that are only partially defined, it usually either gets the job done, or brings us far enough along with minimal effort that we can see what the real problem we’re trying to solve is.

AcaWiki is an exciting experiment in building community around academic research and knowledge. It’s also another in a line of interesting experiments with building applications in a different, organic manner. There’s some interesting work in the pipeline for AcaWiki, including data dumps, a shiny Vector-based skin, and improvements to the forms and templates used. The most interesting work, however, will be the work done by the community.

AcaWiki’s founder, Neeru Paharia, was one of CC’s earliest employees, and she turned to the CC technology team for help with this project.

date:2010-01-04 22:44:17
tags:acawiki, cc, mediawiki, platforms, semantic mediawiki, smw, wiki