ccValidator Updated

Looking in the archive, I see that I haven’t blogged about ccValidator in over six months, because there was nothing to blog about. Today I updated ccValidator with a handful of bug fixes and a nifty makeover. The bugs generally surrounded the decoding and detection of Unicode character sets. Several caused the validator to crash, and a couple just made the output, well, ugly. There’s a report floating around that ccValidator still strips characters of their accents and other markings, but I haven’t had time to track it down yet (and don’t anticipate being able to do so for a bit).

In addition to the bug fixes, the most visible change is that validator.creativecommons.org now uses the same style sheet and layout as creativecommons.org. A minor (and some would argue irrelevant) change, to be sure, but the difference has been bothering me for a while, so I took great pleasure in crossing that particular task off my list.

If you do find a bug in ccValidator, you can email me; just be sure to include the RDF or URI you’re trying to validate.

date:2005-01-14 11:04:34
wordpress_id:250
layout:post
slug:ccvalidator-updated
comments:
category:ccValidator

validator.creativecommons.org Now Available

I’m pleased to announce that ccValidator now has a new home: validator.creativecommons.org now hosts the Creative Commons RDF validator. Yergler.net will continue to host development efforts, and I’ll continue to post announcements regarding ccV improvements and service here. Thanks to Creative Commons for hosting this service going forward.

date:2004-07-12 13:35:43
wordpress_id:151
layout:post
slug:validatorcreativecommonsorg-now-available
comments:
category:ccValidator

Development Update

As I mentioned before, I’ve been hacking on an update of ccValidator. Well, more than an update: a complete refactoring, really. The small item that prompted this was the ability to validate pages whose RDF is specified in a <link> tag instead of an HTML comment. I haven’t even addressed that yet :). But I will; real soon now.

So what’s changed? A lot. First, I’m now using the Quixote framework. This has been a generally positive move: the output.py module that existed previously has been removed, and the HTML generation is a lot less “magic”. I’m also using the new rdfExtract classes (which have, incidentally, been rolled in with ccRdf in CVS). Finally, the reorganization around Quixote has allowed the validator to become much more modular: instead of a bunch of Python that executes as a CGI, there is now a CGI driver and a Python package which can live anywhere in the Python path. Quixote also allows the validator to run under mod_python, under FCGI, or as a standalone process.

The plan is to have a test instance up by the end of the week, followed by a long period of coexistence with the existing stable instance. Eventually, though, the new validator will replace the existing one. I haven’t cut a release yet, but the code is available in CVS (the module is now ccvalidator2).

date:2004-04-06 10:19:29
wordpress_id:110
layout:post
slug:development-update
comments:
category:ccValidator

ccValidator Refactoring

It’s been pointed out that ccValidator only supports RDF embedded in an HTML comment, and not any of the other officially sanctioned ways. Most asked for is <LINK> support, which seems to be used quite a bit. I started to refactor the validator today to support LINKed RDF. I’ve talked about it a lot, but this seemed to be the time to also work on cleaning up the code. Right now it’s something of a mess, and really difficult to completely understand. At PyCon, Andrew Kuchling presented on the Quixote form framework, which allows you to write a single class which defines, validates and processes an HTML form. That, combined with its straightforward templating, made it an obvious choice.

So I’m currently working on refactoring the validator to use Quixote. The goals that I’m working toward include a cleaner code layout, semi-transparent (or at least more Pythonic) HTML escaping (which Quixote, with the possible addition of Nevow, will provide), and support for multiple methods of RDF extraction.

The last item is where much of the work is being invested. The current RDF extraction technique (which uses simple regexes) was borrowed by the Creative Commons Search engine. This simple borrowing demonstrates that there’s a need for a straightforward way to extract RDF from documents, regardless of application. In order to facilitate that in ccValidator and the search engine, I’m working on a pluggable text extraction class, rdfExtract (working title). Hopefully I’ll have a new beta of the validator up later this week which will be easier to maintain and extend.
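To make the idea of pluggable extraction concrete, here is a minimal sketch in present-day Python. It is not the actual rdfExtract code: the function names, the `EXTRACTORS` registry, and the choice to match `<link>` tags by the `application/rdf+xml` type are all assumptions made for illustration. One extractor pulls RDF blocks straight out of the page text with a regex (which also catches blocks wrapped in HTML comments), and the other collects the URLs of LINKed RDF.

```python
import re
from html.parser import HTMLParser

# Matches an RDF block anywhere in the text, including inside HTML comments.
RDF_BLOCK = re.compile(r"<rdf:RDF\b.*?</rdf:RDF>", re.DOTALL | re.IGNORECASE)


def extract_embedded_rdf(text):
    """Return every <rdf:RDF>...</rdf:RDF> block found in the text."""
    return RDF_BLOCK.findall(text)


class _LinkCollector(HTMLParser):
    """Collects href values of <link> tags that advertise RDF/XML."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attrs = dict(attrs)
        # Assumption: LINKed RDF is identified by its MIME type.
        is_rdf = (attrs.get("type") or "").lower() == "application/rdf+xml"
        if is_rdf and attrs.get("href"):
            self.hrefs.append(attrs["href"])


def extract_linked_rdf(text):
    """Return the URLs of RDF documents referenced from <link> tags."""
    collector = _LinkCollector()
    collector.feed(text)
    return collector.hrefs


# Hypothetical registry: adding an extraction method means adding an entry.
EXTRACTORS = {
    "embedded": extract_embedded_rdf,
    "linked": extract_linked_rdf,
}


def extract_all(text):
    """Run every registered extractor and return results keyed by name."""
    return {name: extractor(text) for name, extractor in EXTRACTORS.items()}
```

Calling `extract_all(page_source)` returns a dictionary with one list per extraction method, so a caller like the validator or the search engine can decide which results to fetch or parse next.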

date:2004-03-30 16:00:53
wordpress_id:108
layout:post
slug:ccvalidator-refactoring
comments:
category:ccValidator

ccValidator 1.3.1 Update Now Available

As promised yesterday, I’ve updated ccValidator with a minor update. The source, as always, is available in the release archive. This release exposes some validation errors which were previously masked as “unknown errors”. These updates provide more verbose error reporting in instances where the RDF is well-formed XML, but not properly structured RDF.

Comments, criticism and suggestions are welcome, as always.

date:2004-02-20 12:05:32
wordpress_id:86
layout:post
slug:ccvalidator-131-update-now-available
comments:
category:ccValidator

ccValidator 1.3 now available

ccValidator 1.3 is now available. It’s running live at yergler.net/projects/ccvalidator, and you can download the release tarball here.

This release is mainly a synchronization release; ccValidator now uses the ccRdf core. Porting ccV to this architecture simplified many areas of the code, and provided an excellent test bed for ccRdf. I found a few bugs, and made a few improvements, so there will be a release of ccRdf soon to finish up the synchronization of work.

In addition to ccRdf, ccValidator now sports its own validation image. If RDF parsed from a URL validates properly, you’ll be provided with a bit of HTML to allow you to link to the validation results. Cool, huh? Thanks to Mike L. for the idea.

Thanks for the feedback from all my testers; let me know if you encounter any problems or have any suggestions.

date:2003-12-22 20:39:29
wordpress_id:67
layout:post
slug:ccvalidator-13-now-available
comments:
category:ccValidator

Validator updated; Testers needed

This morning I finished an update to ccValidator. This update is mostly a code cleanup. It finally moves the validator away from the god-awful cclicense.py module (I wrote it; I can say it) to ccRdf. I haven’t tested it extensively yet, so I haven’t updated the production validator yet. That’s where you, dear reader, come in.

I’d love it if you could try it out at http://yergler.net/projects/ccv-cvs. I’d appreciate hearing any feedback about your experience with it. If things are working as expected, you shouldn’t see any difference between the output you get from the test instance and the output you get from the production instance. Thanks for all your help.

date:2003-12-15 11:28:10
wordpress_id:62
layout:post
slug:validator-updated-testers-needed
comments:
category:ccValidator

ccV Results Are Ugly

I was on campus last night and checked my e-mail between classes. One message was a ccValidator bug report, which contained a link to a result page. Since I use Linux at work and Linux/Mac OS X at home, I’d never actually seen my page in Internet Explorer. So I freely confess: the ccValidator results are damn ugly. I’ll have to do something about that.

date:2003-11-06 07:13:48
wordpress_id:31
layout:post
slug:ccv-results-are-ugly
comments:
category:ccValidator

Unicode, Updates, and CVS

ccValidator 1.1.2 is now available here. Fixes include the addition of non-standard Japanese codecs (including Shift_JIS) and further Unicode fixes. I still need to add support for Chinese and other encodings not found in Python’s default codecs package. As always, the current release is available for your validation enjoyment at http://www.yergler.net/projects/ccvalidator.
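For readers curious what handling multiple encodings looks like, here is a rough sketch in present-day Python of trying a list of candidate codecs in order. It is an illustration only, not the validator's actual code: the function name and the candidate list are assumptions. Current Python versions ship a `shift_jis` codec in the standard library, so no add-on codec package is needed for this example.

```python
def decode_with_fallback(raw, candidates=("utf-8", "shift_jis")):
    """Decode bytes with the first candidate codec that succeeds.

    Returns a (text, codec_name) tuple. Raises ValueError if no
    candidate can decode the input.
    """
    for name in candidates:
        try:
            return raw.decode(name), name
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; try the next one.
            continue
        except LookupError:
            # The codec is not installed in this Python; try the next one.
            continue
    raise ValueError("none of the candidate encodings could decode the input")
```

Trying codecs in order is a blunt instrument: a permissive encoding can "succeed" on bytes that were really written in another one, so strict encodings such as UTF-8 belong at the front of the list, and a charset declared by the document or the HTTP headers should be preferred when one is available.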

I’ve also created a CVS repository for my projects, including ccValidator. You can browse it via the web at http://www.yergler.net/cvs. I’m unable to offer pserver access with my webhost, so I’ll be adding a cron job to produce nightly tarballs real soon now.

date:2003-10-27 10:46:53
wordpress_id:23
layout:post
slug:unicode-updates-and-cvs
comments:
category:ccValidator

Magnets are cool; or, Yet another validator update

I’ve been reading the Magnet URI Spec, and the idea behind it’s pretty cool. It’s basically a way to connect documents on the Internet with services provided locally. In the case of the examples given on the website, many of them apply to P2P services. So if I find a song I love and want to share it with a friend, I can e-mail the magnet URI to her and clicking on it will open her P2P app. Cool.

Of course, ccValidator “just happens” to support magnets; if it sees something that looks like a Magnet URN, it constructs a magnet link from it. This leads me to my second point, that I’ve once again updated ccValidator. The changes and fixes are relatively few. Magnets are now constructed to also contain an optional dn, or Display Name, parameter. The display name is extracted from the dc:identifier tag in the work metadata. Also, the validator won’t barf now if it gets an error code when retrieving the URL (404, etc).
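As a rough illustration of the magnet construction described above, here is a sketch in present-day Python. It is not the validator's code: the function name is made up, the URN in the usage note is a placeholder, and how the validator recognizes a URN or reads dc:identifier is not shown. It only demonstrates building a link with the `xt` (exact topic) parameter and the optional `dn` parameter.

```python
from urllib.parse import quote


def build_magnet(urn, display_name=None):
    """Build a magnet link for a URN, optionally adding a display name."""
    # Keep the colons in the URN readable; escape anything else unsafe.
    link = "magnet:?xt=" + quote(urn, safe=":")
    if display_name:
        # dn is optional; percent-encode it fully so spaces and
        # ampersands cannot break the query string.
        link += "&dn=" + quote(display_name, safe="")
    return link
```

For example, `build_magnet("urn:sha1:ABCDEFGHIJKLMNOPQRSTUVWXYZ234567", "My Song.mp3")` produces `magnet:?xt=urn:sha1:ABCDEFGHIJKLMNOPQRSTUVWXYZ234567&dn=My%20Song.mp3`.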

Have fun with it, and as always, feedback is welcome.

date:2003-10-24 18:43:34
wordpress_id:20
layout:post
slug:magnets-are-cool-or-yet-another-validator-update
comments:
category:ccValidator