ccValidator Refactoring

It’s been pointed out that ccValidator only supports RDF embedded in an HTML comment, and not any of the other officially sanctioned ways. The most requested is <LINK> support, which seems to be used quite a bit. I started to refactor the validator today to support LINKed RDF. I’ve talked about it a lot, but this seemed to be the time to also work on cleaning up the code. Right now it’s something of a mess, and really difficult to completely understand. Andrew Kuchling presented on the Quixote form framework at PyCon, which allows you to write a single class which defines, validates and processes an HTML form. That, combined with its straightforward templating, made it an obvious choice.

So I’m currently working on refactoring the validator to use Quixote. The goals that I’m working toward include a cleaner code layout, semi-transparent (or at least more Pythonic) HTML escaping (which Quixote, with the possible addition of Nevow, will provide), and support for multiple methods of RDF extraction.

The last item is where much of the work is being invested. The current RDF extraction technique (which uses simple regexes) was borrowed by the Creative Commons Search engine. This simple borrowing demonstrates that there’s a need for a straightforward way to extract RDF from documents, regardless of application. In order to facilitate that in ccValidator and the search engine, I’m working on a pluggable text extraction class, rdfExtract (working title). Hopefully I’ll have a new beta of the validator up later this week which will be easier to maintain and extend.
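A rough sketch of what a pluggable extractor could look like follows. The class name RdfExtract comes from the working title above; everything else (the function names, the (method, payload) tuples, and the assumption that LINKed RDF is marked with type="application/rdf+xml") is made up for illustration and is not the actual ccValidator code.

```python
import re
from html.parser import HTMLParser

# Simple regexes, in the spirit of the current comment-based technique.
COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)
RDF_BLOCK = re.compile(r"<rdf:RDF.*?</rdf:RDF>", re.DOTALL)


def extract_from_comments(html):
    """Return ("comment", rdf_text) for each RDF block inside an HTML comment."""
    found = []
    for comment in COMMENT.findall(html):
        for block in RDF_BLOCK.findall(comment):
            found.append(("comment", block))
    return found


class _LinkFinder(HTMLParser):
    """Collect href values of <link> tags that point at RDF documents."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Assumption: LINKed RDF is flagged by this MIME type.
        if tag == "link" and attrs.get("type") == "application/rdf+xml":
            if attrs.get("href"):
                self.hrefs.append(attrs["href"])


def extract_from_links(html):
    """Return ("link", url) for each LINKed RDF document; fetching is left to the caller."""
    finder = _LinkFinder()
    finder.feed(html)
    return [("link", href) for href in finder.hrefs]


class RdfExtract:
    """Hypothetical pluggable extractor: runs each registered method in order."""

    def __init__(self, extractors=None):
        self.extractors = list(extractors or [])

    def register(self, extractor):
        self.extractors.append(extractor)

    def extract(self, html):
        results = []
        for extractor in self.extractors:
            results.extend(extractor(html))
        return results
```

With this shape, the validator and the search engine could share one RdfExtract instance and add a new extraction method by registering another function, without touching the callers.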

date:2004-03-30 16:00:53
wordpress_id:108
layout:post
slug:ccvalidator-refactoring
comments:
category:ccValidator

PyCon 2004 Post Mortem

We made it back to Fort Wayne last night around 9:30PM. If anything sticks out from our weeklong trip, it’s the relative pleasantness of our air travel. Nearly every trip over the past year has been marred by delays, unsympathetic agents and bitches manning the ticketing counters. This trip, my first on Continental, came and went without a hitch and with no significant delays. Impressive.

So a brief wrap-up of the week I spent in DC. Was it worth it? Without a doubt. My TODO.txt file expanded exponentially through the week, with a combination of projects to investigate and possible software ideas.

It felt like the schedule this year was more compact, with fewer simultaneous sessions. Whatever the reason for this, it was a good thing. While not every session inspired me equally, they were much more consistent than last year’s. Additionally, there were more sessions I was actually excited about than last year. So I’d say the tighter schedule was a definite improvement.

As important as the actual sessions was the ability to discuss and exchange ideas with peers I hold in the highest esteem. That social interaction is by far the most valuable aspect of PyCon and what will bring me back next year.

So here they are: the highlights of the week (in no particular order):

* the Chandler BOF with Mitch Kapor
* sprinting on Zope 3 with Stephan Richter and Jim Fulton
* IronPython: finally, a reason to screw around with Mono?
* Starkiller: “f’in A”
* Bruce Eckel’s keynote
* presenting my work and getting positive feedback

date:2004-03-27 14:11:13
wordpress_id:107
layout:post
slug:pycon-2004-post-mortem
comments:
category:geek

PyCon Day 3 Wrap-Up

Bruce Eckel’s keynote this morning was excellent. Surprisingly enough, I would almost say I found it the most interesting keynote of the conference. It raised many issues that I think get glossed over sometimes. His presentation did this by acknowledging that Python might not be right for everything, and that sometimes good ideas come from other languages (and sometimes examples of bad ideas do, too).

After the keynote I attended the Web Programming track. The only session I want to highlight is Steve Holden’s Setting a Context for the Web User, which covered a technique he’s developed for making sure that the user of a web application has the proper context for their actions.

Since the word context is often used to mean different things, a definition is useful. Steve defines context as visual and, er, contextual clues that tell the user what will happen when they click a link or a button. The problem with web applications is that users often resort to the Back button when they can’t figure out what something will do. Even worse, the stateless nature of HTTP means that users will edit something, but not post the form, so the changes just go away. This makes for angry users. Steve’s technique uses some client-side JavaScript, along with a call stack stashed into the session variable.

This allows users to traverse a stack of links and have their form changes saved each step along the way. As the user submits each form (through an Apply or Cancel button, for example), they are presented with the form they saw previously. This is an idea that seems so simple in description, but so powerful in execution. There are plenty of things in Stoa where users accidentally (or intentionally) click a link and lose their form changes because they don’t understand the semantics of the web application.
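To make the idea concrete for myself, here is a minimal server-side sketch of a form call stack kept in the session. It is not Steve’s implementation: the FormStack class, its method names, and the plain dict standing in for the session are all assumptions, and the client-side JavaScript that would submit the half-edited form when a link is clicked is left out.

```python
class FormStack:
    """Hypothetical per-session stack of forms the user has stepped away from."""

    def __init__(self, session):
        # The session is assumed to behave like a dict that persists between requests.
        self.stack = session.setdefault("form_stack", [])

    def follow_link(self, current_form, field_values):
        """Save the unsaved values of the form being left, then descend."""
        self.stack.append({"form": current_form, "values": dict(field_values)})

    def finish(self):
        """On Apply or Cancel, return the previous form and its saved values.

        Returns None when there is no earlier form to go back to.
        """
        if not self.stack:
            return None
        return self.stack.pop()


# A user edits a student record, wanders into the schedule form, then picks a course.
session = {}
nav = FormStack(session)
nav.follow_link("edit_student", {"name": "Pat"})
nav.follow_link("edit_schedule", {"period": "2"})
previous = nav.finish()  # back to edit_schedule, with period "2" still filled in
```

The point of the stack is that finishing any form lands the user on the one they left, with their edits intact, instead of wherever the Back button happens to take them.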

After lunch, David Ascher’s oddly titled session, “Flour and water make Bread”, was actually a very interesting description of what goes into mixing business and Open Source software. One of his points that struck a chord with me was that remembering your audience is important. If you come up with a vaguely obscene-sounding application name because you think it’s funny, it’s not going to fly with the business community. I know this is true because DataSiphon was initially named PyMP (Python MySQL imPorter). Gee, wasn’t I clever. I tried to use it at client sites, and found myself actually blushing. Hence the name change to DataSiphon. Overall David was obviously qualified to speak on Open Source and business, and I found his insights right on and very informative.

After David’s presentation on Open Source and business, Bob Ippolito spoke on MacPython. I’ve been struggling with how Mac OS X fits into my socio-software-political belief structure, a struggle which is made all the more interesting since two of my four machines are Mac OS X based. For some reason I envisioned Bob as an old guy with straggly white hair, talking about the bad old days of punch cards. Not the case at all.

Bob’s presentation was a good overview of the state of Python on Mac OS X. I won’t go into details that are probably better described at the PythonMac website. At this point I’m off to the airport. I’ll post some wrap-up thoughts this weekend.

date:2004-03-26 14:30:20
wordpress_id:105
layout:post
slug:pycon-day-3-wrap-up
comments:
category:geek

A Funny Thing Happened On The Way to the Conference

This morning I was walking from the hotel to the conference when the weirdest thing happened to me: I was cruised. Big time. This hot muscle bear with a five o’clock shadow at 7 AM caught my eye from his Jeep Wrangler while he was stopped at a red light. I held his gaze until I had walked past and when he finally drove by he gave me a little salute. Being objectified has never been so satisfying. You may ask how I know it was cruising, to which I can only respond: oh, I know.

This is the sort of thing that never happens to me, only to my friends. I could credit the fact I was wearing my Guess jeans, which were purchased with the hope of encouraging this sort of attention. But I think I’ll choose to believe that it was really just me. I like this town.

date:2004-03-26 09:31:40
wordpress_id:104
layout:post
slug:a-funny-thing-happened-on-the-way-to-the-conference
comments:
category:my life

PyCon Day 3

This morning is the start of PyCon day 3. Bruce Eckel, of Thinking in [some language] fame (if fame is the right word), is giving the keynote. Bruce and I sprinted together last year at the Zope 3 sprint, and he’s a very smart man. His presentation is titled How to Argue About Types, a subject which, since he’s written about C++, Java and Python, he probably understands.

He’s starting out with a small report from the Software Development conference, which he attended last week. Apparently Java threads have always been broken. Who knew. I knew that Java threads changed with each JDK, but never knew why. Now I do: they just can’t get them “right”.

The D Language was another area of interest at the SDC, and Bruce claims that it often runs faster than C and is still compiled to native code. It includes built-in unit testing and garbage collection as well. Definitely something to look into.

Now on with the show. Today’s sessions are a little less focused on my core areas of interest, but that’s OK. There’s a session at the end of the day I’m looking forward to on MacPython. I’m interested in finding out what’s different about it and how I can use it.

After the sessions, we’re heading right to the airport and with any amount of luck we’ll be back in Fort Wayne around 9:30 this evening. We’re leaving the hotel at 3:15, so we’ll hopefully be at the airport at least 2 hours before our flight leaves, and hopefully that’ll be enough time. I have my doubts, but then that’s to be expected.

date:2004-03-26 09:10:46
wordpress_id:103
layout:post
slug:pycon-day-3
comments:
category:geek

PyCon Day 2

Today is the second day of PyCon, and while I have enjoyed the talks I’ve attended, I’ll admit that a) I was a little preoccupied with my own presentation and b) I’m tired. The day started with a keynote by GvR on the future of Python. Nothing earth shattering, but interesting nonetheless. If anything, I was a little surprised to hear that while 2.4 is coming this summer (probably), 3.0 is a ways off. I guess I shouldn’t be surprised: the >= 2.3 series is incredibly powerful and featureful. I just thought 3.0 would have a more definite timetable.

The morning was spent in the Zope track. Jim Fulton presented the Zope Development Roadmap, which wasn’t really anything new to me. Joel Burton’s presentation on PostgreSQL in Python and Zope was interesting in a peripheral sort of way. I don’t currently have any plans to use PostgreSQL over MySQL, but he did have some “best practice” suggestions that Vern and I agreed would be good to roll into Stoa. Our presentation on the development of Stoa rounded out the morning.

As I predicted, we had no problem filling the half hour time slot, which was good: I knew I was going to talk fast because I was nervous, so people just assumed I was talking fast to stay within the time limit. A few interesting notes: I probably pissed off the Plone people when I declared that I couldn’t ever figure out how it would make my life better. I stand by that, but I had my ear bent after the presentation by enough people that I want to check it out again. Even though there were plenty of Plone supporters, most of them agreed: the state of Plone when we looked at it before probably did turn us off. I also had a couple of people approach me wanting to know where they could get the code. It will be available soon. In general the reaction seemed positive, and I think we built some definite interest in Stoa as a project.

After lunch I attended Travis Hartwell’s session on Python and GTK. I was impressed with how little code is really necessary to build GTK interfaces. Of course, unless the non-X11 Mac OS X port starts moving (and fast), I doubt I’ll use it much. wxPython, for all its warts and problems, still does the best cross-platform GUIs I’ve seen. The rest of the time before the break was spent in lightning talks. These were generally interesting, especially Graham Fawcett’s presentation about Victor, a course management system he’s working on at the University of Windsor. He’s using Quixote, and it looks like the lightweight approach has allowed him to build an impressive project quickly.

date:2004-03-25 17:09:54
wordpress_id:102
layout:post
slug:pycon-day-2
comments:
category:geek

PyCon BOFs

Last night I engaged in a sort of BOF marathon. For someone who was previously a BOF virgin, it was an interesting experience. BOFs, or Birds Of a Feather meetings, are opportunities for people with similar interests to meet and converse about a given subject in a casual environment. I attended three BOFs last night: distutils, Quixote and Chandler.

The distutils BOF consisted of a spirited discussion about how broken or wrong distutils currently is. Distutils, which in theory provides a way to sanely distribute Python modules, in practice makes it incredibly difficult to do things that don’t match the distutils mindset exactly. Fred Drake et al. have organized a distutils Open Space session tomorrow, so I’m hopeful that a strategy or plan for improving distutils (distutils++) will emerge before the end of the conference.

The Quixote BOF was by far the smallest and most low key. I haven’t played with Quixote for a while, but after seeing Andrew’s presentation of Quixote, I think I need to look at it again. The main focus of the Quixote BOF was advocacy: how do users of Quixote encourage and promote its use to others?

Finally, the Chandler BOF was the largest and most interesting. Mitch Kapor, who provided a rather dry keynote, showed that his forte is definitely speaking to slightly smaller groups. He also demonstrated that he’s very passionate about Chandler and what the OSAF is trying to do. Starting at 9PM, Kapor et al. held forth to a fairly full room about Chandler: what’s going on, what’s going wrong, and what’s coming up. Many community members expressed concern about two very different topics: internationalization and transparency. Barry Warsaw led the charge in encouraging Mitch and Ted to look into internationalization and localization now, instead of later. Others (I don’t know their names) questioned the frequency of releases and the openness of the discussions. I’m not sure how valid this is: when I’ve looked, the wiki seems fairly all-consuming. I left at 10:10, but Vern says it went until nearly 11:00PM.

Tonight I’m heading to the Python in Education BOF, which should be interesting.

date:2004-03-25 08:49:24
wordpress_id:101
layout:post
slug:pycon-bofs
comments:
category:geek

PyCon Day 1

Three-quarters of the way through the first day of the PyCon sessions, it’s already been an exciting and enlightening experience. Mitch Kapor, most recently of the OSAF, presented the keynote to start the day. I found it particularly interesting that he acknowledged up front that he stands in awe of developers and coders. Mitch spoke of attending a summer program for budding astronomers in 1966. A computer, a Bendix G10, was available, and while he was able to make it perform basic computations and tasks, there were other students there who would stay up all night hacking it to calculate comet trajectories and orbits. Here’s what he said:

“What I learned was that while I was fascinated by computers, there were people who had something I didn’t have: some kind of skills, patience, and who by writing code could get computers to do amazing things. And I realized I wasn’t that sort of person, but I admired them.”

The sessions I attended this morning were book-ended by two presentations on web development: mod_python and Quixote. I attended the mod_python session to gain background on exactly what it entails and how it could possibly be used. I took two things away from the presentation. First, mod_python seems better suited for customizing Apache than developing web applications. It includes PSP (ostensibly Python Server Pages), but its real power seems to come from the hooks it exposes for the Apache request serving process. I can much more easily imagine “fixing” mod_rewrite than developing an entire application using mod_python. The second point I took away (or maybe was just reminded of) is that Tufte is right.

The last session of the morning was an overview of Quixote. I’ve experimented with Quixote in the past, but never worked with it enough to develop a full-fledged application. At the session Quixote’s form framework was demonstrated, which impressed me with its simplicity and clarity. While not exactly the same, it definitely shares some philosophy with Zope 3’s schema-based form generation. It was interesting that both mod_python and Quixote developers acknowledged inspiration from Zope, and stated that the need for something lighter weight drove their development. I think the advent of Zope 3 will help remove that perceived weight: Z3 will still have the powerful mechanisms of Zope 2, but they’ll be exposed in a much more Pythonic way. This of course raises the question about where the line between the Zope 3 and Quixote problem domains lies. I suspect it has something to do with the need for content management facilities.
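The appeal of the one-class-per-form idea is easier to see in code. What follows is only a plain-Python sketch of the pattern, not Quixote’s actual API: the SignupForm class, its field table and its method names are invented for illustration.

```python
import html


class SignupForm:
    """Hypothetical form: definition, validation and processing in one class."""

    # Field name -> HTML input type.
    fields = {"name": "text", "email": "text"}

    def render(self, values=None, errors=None):
        """Return the form as HTML, escaping user-supplied values."""
        values = values or {}
        errors = errors or {}
        rows = []
        for name, kind in self.fields.items():
            value = html.escape(values.get(name, ""), quote=True)
            error = html.escape(errors.get(name, ""))
            rows.append(
                '<label>%s <input type="%s" name="%s" value="%s"> %s</label>'
                % (name, kind, name, value, error)
            )
        return '<form method="post">\n%s\n<input type="submit">\n</form>' % "\n".join(rows)

    def validate(self, values):
        """Return a dict of field name -> error message (empty when valid)."""
        errors = {}
        if not values.get("name", "").strip():
            errors["name"] = "required"
        if "@" not in values.get("email", ""):
            errors["email"] = "must contain @"
        return errors

    def process(self, values):
        """Re-render with errors if invalid; otherwise act on the submission."""
        errors = self.validate(values)
        if errors:
            return self.render(values, errors)
        return "Thanks, %s" % html.escape(values["name"])
```

Everything a maintainer needs to know about the form lives in one place, which is what made the demonstration feel so clear.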

In between the two web sessions this morning I attended Andrew Koenig’s presentation on how Python helped him design his new kitchen. This was by far the most entertaining and most inspiring presentation of the morning. Koenig stressed the importance of using tools you have, and more importantly using tools you understand in order to solve problems. In generating the manufacturing outline for his kitchen countertop, he used pic and troff, because he knew how they worked. And he used Python to generate the input for pic, because it was possible to quickly go from thought to image, with few steps in between. The questions asked of Koenig were also interesting. They demonstrated (to me, at least) that people like small (simple?) pieces of software that do a particular task and do it well. This is probably worth remembering as we developers are tempted to add functionality because “it’s really cool”.
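As a small illustration of the Python-generates-pic approach, here is a sketch that emits pic source for a row of labelled boxes. It is not Koenig’s code: the function name, the segment list and the scale factor are all invented, and the labels are assumed not to contain double quotes.

```python
def countertop_to_pic(segments, scale=0.05):
    """Return pic source drawing one labelled box per countertop segment.

    segments: list of (label, width, depth) tuples, measured in inches.
    scale: drawing inches per real inch.
    """
    lines = [".PS"]  # start of a pic picture
    for label, width, depth in segments:
        # pic places successive boxes left to right by default.
        lines.append('box wid %.2f ht %.2f "%s"' % (width * scale, depth * scale, label))
    lines.append(".PE")  # end of the picture
    return "\n".join(lines) + "\n"


# Hypothetical measurements, purely for demonstration.
print(countertop_to_pic([("sink", 36, 25), ("range", 30, 25)]))
```

The output can then be fed through pic and troff (with groff, the -p option runs the pic preprocessor), which keeps the Python side down to simple string formatting.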

This afternoon I started out intending to write up my experience of the morning’s presentations. I sat down in the IronPython presentation intending to tune out the speaker and found myself transfixed. IronPython is an implementation of Python on the .NET CLR. More interesting to me is the intention that IronPython should allow Python code to run on Mono when it reaches 1.0. The use of the ECMA standard CLR means it should be possible to mix Python code with code generated in other CLR languages. This seems really exciting, since each language seems to have its own special niche.

The author, Jim Hugunin, started out intending to write a paper on why the CLR is so hostile to dynamic languages. He ended up determining that many basic Python constructs would actually run faster on the CLR than CPython. While IronPython is still far from ready for production use, it was amazing to watch Jim demonstrate using the Python console to import Microsoft’s Agent framework and instantiate the Merlin animated agent. While this was the only interesting use of Merlin I’ve seen, I am glad he didn’t choose Clippy.

Following Jim was the presentation of a static-type inference engine, Starkiller. Michael Salib, a graduate student at MIT, demonstrated the algorithms and system he’s using to speed up Python using type determination and inference. Michael is memorable as the most irreverent speaker yet. Some highlights:

* “because I’m from MIT I’m going to make bland assertions without backing them up with numbers”
* “I’m RMS, I wrote emacs. Will you sleep with me?”

After the performance focus of the IronPython and Starkiller talks, I wandered in to hear about DEAP, a package for generic scientific data graphing and analysis developed for astronomers. Nothing I’ll ever use, but I was amazed at the breadth of applications being developed in Python. Ditto for the following talk on using SimPy to model nuclear fuel manufacturing processes.

You can find the full text of the PyCon presentations online at the PyCon website.

date:2004-03-24 15:57:53
wordpress_id:100
layout:post
slug:pycon-day-1
comments:
category:geek

Zope 3 Sprint, Day 3

Today was the final day of the Zope 3 sprint, and overall I’d call it a success. We started the day finishing up the implementation of type-based subscriptions we started yesterday. It’s amazing what a little distance can do: we came back, merged the work our two teams did yesterday and were able to write tests and commit the remainder of the task before lunch.

After lunch we began work on the second half of the proposal, instance event subscriptions. While we made quite a bit of progress, a few details in how to register the events kept us from merging the branch back to the HEAD of Zope 3 before the end of the day. Jim spent a bit of his time explaining the motivation behind the proposal and what he envisioned as the implementation details. Jim is an excellent teacher, but I’ll admit my head was swimming a bit when we returned from lunch and began work. As Jim guided me through the implementation of a Zope3 Adapter, it was as if the clouds parted and the geeky sun shone down on me. All I could think was “this is too damn easy to actually work.” As Mark pointed out to me yesterday, that feeling is the sign of a well designed framework.

What was most exciting for me today was hearing about what’s coming yet in the Zope 3 event architecture. Currently events can be published or registered, and “listened” for. So you can receive notification that Object A has been deleted, but you can’t do anything to stop it. A proposed improvement to this is the implementation of “TentativeSubscribers”, objects which want to know about “tentative” events. Objects subscribing to tentative events can decide if they have issues with the event occurring and either veto it or return some issue(s) for the software to resolve before going ahead with the event. While still completely in the talking phase, this sort of framework would allow for powerful and rich semantics to be developed into software. An example from Stoa: student A is scheduled for Art 2nd period; when you try to schedule the student for Biology 1st period, you’re told that Biology is a double period class, and as such you’ll need to remove Art from the 2nd period schedule; is that OK? Now checking like this is presently possible, but the Zope 3 framework will allow it to be implemented much more directly and cleanly.
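Since the tentative event idea is still only being talked about, the following is just my own plain-Python sketch of how the Stoa scheduling example might play out. None of these names (TentativeEvent, EventChannel, double_period_check) come from Zope 3; they are assumptions made up to show the shape of the idea.

```python
class TentativeEvent:
    """Hypothetical event describing something that is about to happen."""

    def __init__(self, description, **details):
        self.description = description
        self.details = details


class EventChannel:
    """Hypothetical channel that consults subscribers before an event occurs."""

    def __init__(self):
        self.tentative_subscribers = []

    def subscribe_tentative(self, subscriber):
        self.tentative_subscribers.append(subscriber)

    def propose(self, event):
        """Ask every tentative subscriber; return the list of issues raised.

        An empty list means nobody objected and the event may go ahead.
        """
        issues = []
        for subscriber in self.tentative_subscribers:
            issue = subscriber(event)
            if issue:
                issues.append(issue)
        return issues


DOUBLE_PERIOD = {"Biology"}  # made-up course data


def double_period_check(schedule):
    """Build a subscriber that objects when a double-period class would clash."""

    def check(event):
        course = event.details["course"]
        period = event.details["period"]
        if course in DOUBLE_PERIOD and schedule.get(period + 1):
            return "%s is a double period class; remove %s from period %d?" % (
                course, schedule[period + 1], period + 1)
        return None

    return check


schedule = {2: "Art"}  # student A already has Art 2nd period
channel = EventChannel()
channel.subscribe_tentative(double_period_check(schedule))
issues = channel.propose(TentativeEvent("schedule", course="Biology", period=1))
```

The scheduling code never needs to know about double periods; it only asks the channel whether anyone has issues and presents them to the user.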

date:2004-03-23 21:10:31
wordpress_id:99
layout:post
slug:zope-3-sprint-day-3
comments:
category:geek

Zope 3 Sprint, Day 2

Yesterday’s sprinting wasn’t quite as viscerally satisfying as the previous two days, but progress was still made. Mark McEahern and I formed one half of a four man team working on the event subscription and publication system. In particular we began work on Type Based subscriptions, per the proposal.

Our first task was to implement a new ZCML directive, subscriber. The subscriber directive registers an event “listener” (to abuse Java terminology) for a particular event and class. So you can, for example, listen for removal events for all objects implementing IFooBar. Powerful stuff, and of course it requires some refactoring of existing work.
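Stripped of ZCML and real interfaces, the type-based subscription idea can be sketched in a few lines of plain Python. This is not the Zope 3 implementation: ordinary classes and isinstance stand in for interfaces, and SubscriptionRegistry, RemovedEvent and the handler are names invented for the example (only IFooBar is borrowed from the text above).

```python
class IFooBar:
    """Stand-in for an interface; real Zope 3 code would use zope.interface."""


class RemovedEvent:
    """Hypothetical event announcing that an object was removed."""

    def __init__(self, obj):
        self.obj = obj


class SubscriptionRegistry:
    """Hypothetical registry keyed on both the event type and the object type."""

    def __init__(self):
        self._subscribers = []  # (event_type, object_type, handler) triples

    def subscribe(self, event_type, object_type, handler):
        self._subscribers.append((event_type, object_type, handler))

    def notify(self, event):
        """Call each handler whose event and object types both match."""
        for event_type, object_type, handler in self._subscribers:
            if isinstance(event, event_type) and isinstance(event.obj, object_type):
                handler(event)


class Document(IFooBar):
    pass


removed = []
registry = SubscriptionRegistry()
# Listen for removal events on anything "implementing" IFooBar.
registry.subscribe(RemovedEvent, IFooBar, lambda event: removed.append(event.obj))

doc = Document()
registry.notify(RemovedEvent(doc))       # handler fires
registry.notify(RemovedEvent(object()))  # ignored: not an IFooBar
```

In the real framework the subscriber directive does the registration declaratively, so the Python code only has to supply the handler.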

IAddNotifiable and IRemoveNotifiable are two legacy interfaces that were developed before the event system was fully developed. So our second task was to begin refactoring existing code which used these interfaces to use the new subscriber framework. And that’s where we pick up this morning.

I had imagined that PyCon would be an opportunity for me to work on other development as well as Python stuff. Unfortunately, this hasn’t turned out to be the case. I leave the sprint every day with ideas and the urge to work on them, but my brain just doesn’t want to work in the evening lately. For this reason I’m glad I have two weeks of break from work coming when I return home so I can scratch the myriad of itches PyCon is stimulating.

date:2004-03-23 09:12:52
wordpress_id:98
layout:post
slug:zope-3-sprint-day-2
comments:
category:geek