In Search of Better Bookmarks

Earlier this month while working on a research paper I became frustrated with the tools available to me. I was conducting research in on-line, web-based databases on campus, knowing full well I would need the information I found later that day at home. After collecting my inch-thick stack of output from the printer, I wrote two different posts in quick succession describing my frustration and expressing my desire for a better way to manage the information I found. This post is an attempt to clarify that vision and promote the idea of a better way to manage information found on the web. More simply put, a vision for better bookmarks.

The Internet has fundamentally changed the way people communicate. You can debate whether we’ve seen the full extent of that change yet, but the fact is our lives have changed. And mostly for the better. Students today are more likely to use Internet-based sources for research, as well as use Internet tools such as instant messaging and e-mail for communication.

However, this change is not without its own problems. People are increasingly relying upon multiple sources of information to form aggregate opinions and stay informed. Additionally, people are faced with mentally organizing information from a vast array of sources. It seems that ten or twenty years ago, people obtained much of their information from books and periodicals. If they needed to recall a fact or story they had previously read, there was a limited number of sources they needed to search. Today, information workers may read dozens of news sites or blogs daily, in addition to any mailing lists they may participate in. This rise in volume means it’s much more difficult to recall and locate individual pieces of information on demand.

The simple fact is this: information workers are faced with a growing volume of information, and a corresponding need to categorize, quantify and assimilate this information for future use. The focus of this discussion is web-based information, which is probably second only to e-mail in quantity. The traditional method for recording relevant web pages, bookmarks, is woefully inadequate for the task today.

Bookmarks, as we have traditionally known them, are inadequate for several reasons. First, there is the manual nature. A method for marking pages as relevant, important, or note-worthy is definitely important, but often users would like to recall information from pages that were on the cusp of the importance threshold: not quite important enough to warrant their own bookmark, but relevant to some task nonetheless.

Second, and most importantly, is the task of organization. Major web browsers today allow for the creation of folders and sub-folders to organize bookmarks, but the problem remains that this, too, is a manual, time-consuming process. It’s important to note two additional failings of this model. First, a bookmark exists in a single place. A political scientist may wish to track information on both national legislatures and copyright law; a web page describing EUCD implementation falls under both categories. True, this user could create two bookmarks, one in each folder, but the semantic difference is important. Second, if a user commits to organizing their bookmarks, the process of maintaining that organization is onerous and time-consuming. A tenet of Extreme Programming is that you probably won’t be right the first time. Jim’s Second Law of Engineering states that you can’t solve a problem unless you know the answer. These two principles imply that it is unlikely a user will know the “correct” or optimal filing hierarchy at the start of a project. Only as users work with information and evolve their ideas and “mental map” do they develop this optimal hierarchy. However, given the manual nature of bookmark management, it is not unlikely that sub-optimal organizations persist simply out of inertia.

So what features would improve the usefulness of bookmarks? I can think of three. First, I want them to store more information. More than a URL, I need information (annotations or metadata) that describes what a page means to me. Is it vacation planning, business relocation, or both? This purpose is currently served in a limited fashion by folders. However, I envision a system where not only are the URL and “topic” (folder name) available, but also arbitrary annotations (Wiki-style, anyone?), and possibly even a cached copy of the content.
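To make that first wish a little more concrete, here is a rough sketch of what such a record might look like. It is only an illustration under my own assumptions: the `Bookmark` class, its field names, the `by_topic` helper and the example.org URL are all made up for this post, not taken from any existing browser or extension. The point it shows is that a bookmark can carry several topics at once (so the EUCD page lives under both "national legislatures" and "copyright law" without being duplicated), along with free-form notes and an optional cached copy.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Bookmark:
    """Hypothetical richer bookmark: more than a URL and one folder."""
    url: str
    title: str = ""
    # A set of topics replaces the single folder, so one bookmark
    # can belong to several categories at the same time.
    topics: set[str] = field(default_factory=set)
    # Arbitrary, wiki-style annotations.
    notes: list[str] = field(default_factory=list)
    # Optional cached copy of the page content.
    cached_html: Optional[str] = None
    saved_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def by_topic(bookmarks, topic):
    """Return every bookmark tagged with the given topic."""
    return [b for b in bookmarks if topic in b.topics]


if __name__ == "__main__":
    page = Bookmark(
        url="https://example.org/eucd-implementation",  # placeholder URL
        title="EUCD implementation overview",
        topics={"national legislatures", "copyright law"},
        notes=["Relevant to the EU copyright paper."],
    )
    # The same record shows up under both topics.
    print(len(by_topic([page], "copyright law")))
    print(len(by_topic([page], "national legislatures")))
```

Because topics are just labels on the record, re-filing a page later means editing a set rather than dragging bookmarks between folders, which is the inertia problem described above.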

Second, I need a system that’s portable. It is telling to me that a “feature” of Apple’s dotMac service is the ability to synchronize bookmarks across multiple machines. There’s absolutely no reason I shouldn’t be able to maintain a single set of “bookmarks” at school, at home and at work. Whether I store them on a web server or USB memory is irrelevant; the information needs to be portable. Note that this also ties in with my first point: if I’m working on a paper at school and want to use information from a database that’s only available to computers at school, the ability to transport my “bookmarks”, annotations and a cached copy of the page home in one package is very valuable to me. It opens up the list of places where I can work with that information, and frees me to work where I want to.
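As a rough illustration of the "one package" idea, the sketch below writes a list of bookmark records (URL, topics, notes and cached content) to a single JSON file and reads it back. JSON is simply my assumption for the example; the function names, the record layout and the file path in the usage note are hypothetical, and a real tool might well choose a different format.

```python
import json
from pathlib import Path


def save_bookmarks(bookmarks, path):
    """Write a list of bookmark dicts to a single JSON file."""
    text = json.dumps(bookmarks, indent=2, ensure_ascii=False)
    Path(path).write_text(text, encoding="utf-8")


def load_bookmarks(path):
    """Read the list of bookmark dicts back from the JSON file."""
    return json.loads(Path(path).read_text(encoding="utf-8"))
```

At school I might call `save_bookmarks(records, "/media/usb/bookmarks.json")` (a made-up path), then call `load_bookmarks` with the same file at home. Since the cached page content travels inside the file, the records stay usable even where the original database is unreachable.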

Finally, the system needs to be integrated. A comment to one of my previous posts suggested that what I really want is XiTouch, which the commenter bills as a “web based PIM/blogger.” No, that’s not what I want. I want my browser to be smarter, not some web service where I have to type in notes. I’m not saying XiTouch isn’t useful, but looking at the web page, I know it’s not for me. To me, the advantage to searching for “better bookmarks” instead of an electronic Notepad is that bookmarks are integrated, first-class (well, that may be arguable) citizens of the browser world. And by making them better or smarter, we can enable better use of information.

I want my “bookmark” information to be smarter, portable and completely integrated with my browser. Shouldn’t it be possible to pull a USB memory device from my machine at work, drive home, plug it in and start where I was? Or hop on a plane, plug it into my laptop (at the appropriate altitude, of course), and work with the cached content? These two use cases really highlight what I consider to be “optimal” uses.

I don’t think I’m alone in identifying this as a problem. I haven’t done an exhaustive search, by any means, but there are some projects underway which I believe share at least the principles I’ve outlined here. In the category of information management, both Chandler and Haystack come to mind. Chandler, while written in Python and cross-platform, is very e-mail and PIM centric. I like what they’re doing, but it fails the integrated test. Haystack, well, it’s a research project, and as cool as it may be, waiting 15 minutes for my app to start really doesn’t fly for me.

I also like Dashboard, but I think it’s trying to solve a slightly different problem. Instead of helping organize incoming information, Dashboard tries to show you information relevant to your current task, which is definitely cool. While Dashboard is GNOME-only, it is written in C# (Mono), so I hold out hope that it could run on other platforms at some point.

Finally, TrailBlazer is a Mac OS X browser put together by the UIUC chapter of ACM. TrailBlazer builds on Apple’s WebCore framework, so it inherits the excellent rendering engine from Safari. Of course, that also means it’s Mac OS X specific, a killer for me since I use Linux at work, Mac OS X at home, and Win32 at school. Even with that “problem” it’s worth examining because it introduces a new paradigm for browsing history in the context of paths instead of just a list of pages.

So now that I’ve recorded my opinion for posterity, what’s next? I’d like to know if I’m way off base or right on or somewhere in between. Is this really a problem, or just the perceptions of a pseudo-crackpot? And if it is a problem, does anyone have any ideas that I’ve overlooked? “Must-have” features that will make life better? As far as implementation goes, I lean towards implementing this as a Mozilla extension. I have some experience with them, and it seems like the only way I know of to satisfy the “cross-platform” and “integrated” litmus tests. Even with the Gecko platform to build on, this may prove to be a larger task than I expect, but I’m willing to take that chance. Anyone else?

date:2004-04-30 10:05:07

License Tagging

In a previous post I mentioned I’ve been doing some contract work for the Creative Commons lately. I just uploaded an update to that and wanted to fill everyone in on what I’ve been doing. One of the unanswered tech challenges was the creation of a GUI for embedding CC license claims. A couple weeks ago Mike contacted me and asked me to work on this, which I happily agreed to do.

The result is ccTag-gui, a cross-platform wxPython-based GUI application for embedding license claims and generating the corresponding validation RDF. Right now it’s pretty simple: it only supports MP3 (although adding Ogg support should be trivial) and only generates the RDF for copy and paste. You can actually try it out, if you like. “Releases” (and I use that term loosely) are available here. There’s a Win32 installer, a Mac OS X disk image, and a slew of RPMs. The RPMs are the least tested, and they don’t enforce the wxPython dependency (you need at least 2.5.1). The Win32 and Mac OS X packages, on the other hand, are completely self-contained.

Even though the tool is really only a “technology validation” prototype right now, developing it has been a good experience for a couple of reasons. First, I’ve been reminded of wxPython, and just how good it is (and is getting). Second, it’s been an excellent exercise to write an app that has to be cross-platform from day 1. While Python is better than some languages (cough, Java, cough) at allowing true cross-platform development, there’s a real difference in work style between writing an app and then figuring out what it will take to make it work on another platform, and writing an app and testing on 3 platforms from the start. I like to think the resulting code has fewer bugs and better design, but I don’t have any proof for that.

In conclusion I’m pretty happy with the way it’s turned out, and if you have any suggestions, please let me know.

date:2004-04-29 12:32:14

Learning From The Best

In reference to my tirade on a unified blogging/research/information organization tool, here’s what I currently use, love and wish could be tied together:

  • Wiki. Need I say more?
  • SubEthaEdit : shared, real-time editing; how can you have a collaboration tool without it?
  • CVS / Subversion : versions. Lots of them. Because if there’s a way, I’ll screw it up.
  • Atom : “feed”-ing the future (OK, even I think it’s a horrible pun and I wrote it)
  • : what good is information organization if you can’t use it easily?
  • Creative Commons : what tool would be complete without the ability to embed, detect and “understand” the future of the public domain?

I’m still wrapping up some school work this week before finals, but as soon as I have a free minute, I’ll put together a more coherent document. Stay tuned.

date:2004-04-22 17:07:10

mozCC 0.8.0 Preview

I’ve been working on a much-needed update to mozCC, and wanted to give users a preview of what I’m working on. MozCC 0.8.0 is supposed to be (to my way of thinking) a mostly feature-complete, cleaned up version of the existing mozCC. Improvements will include:

  • localization support (done)
  • better RDF extraction (done)
  • improved details interface (in progress)

|image0| It’s the final item I want to share today. In previous versions of mozCC, clicking on the status bar icons or toolbar icon presented the user with a details dialog. This dialog wasn’t all that useful; the formatting was obtuse and I’m not sure it really made it clear to people just what they could do with the work. The new dialog aims to improve that. It presents a list of licenses and works defined on the page. Each list has human-readable text describing either a user’s right(s), or details about the work. Finally, there’s a “this page” section, which describes what license applies to the particular page you’re looking at. The screenshot to the right shows the license tab open, viewing my blog earlier this morning.

As always, feedback and suggestions are welcome.

date:2004-04-19 11:52:38

A Clarification Regarding Java

In a previous post I wrote about how RMS’s essay on writing free software for Java struck a chord with me. In summary, RMS states that writing free software in Java that uses JRE features not available in free software Java implementations (such as the GNU Classpath) reduces the actual freedom of that software. I drew a parallel between non-free JRE features and Mac OS X-specific free software. It seems to me that free software, written to target Mac OS X only, is not really free. I consider myself a pragmatist about many things, and I think this view fits: I use both Linux and Mac OS X and it’s frustrating when I can’t use the same software in both places.

OK, on to the point: I was thinking about this again in the shower this morning (I’m not really sure why), and I realized there is a pragmatic difference I failed to mention previously. I don’t code in Java regularly, mostly because I just have never had a compelling need. However, it does seem to me that there is a pragmatic difference between free software “constrained” by non-Free, zero cost software and that constrained by non-Free, costly software. That is, does the fact that Sun and IBM give away their respective JREs make the constraint more palatable than the constraint of Mac OS X, which as we all know costs money? Does, or should, the question of monetary cost enter into the equation of how Free software is?

date:2004-04-19 09:25:28

Hello, Brothers; Goodbye, Liver

I spent the majority of the weekend in Lafayette with my fraternity brothers. Friday night the chapter held an alumni appreciation dinner. It was a casual, informal affair, and a nice opportunity to reconnect with the active brothers and find out what’s going on in our lives. Stephen, the chapter alumni liaison (proposed re-election platform: “I’m the one that can spell liaison properly!”), is to be commended for his work; as jaded as it may sound, I feel better opening my checkbook when I know I’m appreciated.

I spent a couple hours Saturday morning taking pictures for my photojournalism class. I ran into Rob Mate, the chapter faculty advisor and my former mentor, on campus. He was unable to attend the initiation later on Saturday, so I was glad that I was able to spend some time catching up with him. Rob is one of those people who even at his worst seems like someone I aspire to be.

Saturday evening was initiation, and then the after party. Luckily it was held at Steven’s, where I was staying. I think it was Brittany who once said “I love hosting the party; it’s so much easier to stumble home at the end of the night.” All in all it was a fun night. To quote Steven’s summary e-mail from the next day:

“Fatalities and mis-haps of the night: One burning bush, a ruined stereo, several glass bottles off the balcony, an indescribable amount of beer spilled in my bed and all over the living room floor, and my neighbors baring their cocks to Jason Glassburn. I told you girls, all it takes is a few of my shots and off come the pants.”

Yes, we burned a bush. Who knew cigarette butts could light brush on fire? And yes, Steven’s bed was covered in beer by an alumnus who shall remain nameless (cough Randy cough). And to quote Steven’s neighbor: “I’m progressive! I’m open-minded! It’s just a cock!” Right.

date:2004-04-19 09:12:59
category:my life

Random Life Notes

Just a couple of things I’ve been thinking about lately that haven’t been serious enough to warrant their own entry.

First, Garrett and I are playing volleyball in a city league right now. For those who know me, this commitment to a sporting activity is probably surprising, and with good reason. Of course, we suck. Really. I was so excited on Wednesday night: we won one out of four games, and it wasn’t a forfeit! Who knew it was possible. I’m actually sort of enjoying it, so much so that we’ve signed up for the summer beach league. I feel so butch.

On a completely unrelated note, the past week has been rather fun for me, since I’m getting paid to do what I love. Mike from the Creative Commons contacted me a while ago about doing some contract development. It’s not that I don’t love my day job, but rather that it’s nice to do work you know is appreciated and you know is worthwhile. Some days there’s just not enough positive attitude in the world to make diagnosing idiot problems with golf software fulfilling.

Finally, this weekend my home chapter is hosting initiation. We’re initiating the Rho-Glassburn class, named for my big brother and mentor, Jason D. Glassburn. I had a good time at initiation last semester, and hopefully this weekend will be the same (although I’m trying to cut down on the binge drinking). In addition to the standard ritual and after party, the active brothers have organized an alumni appreciation dinner this evening. I already feel appreciated.

date:2004-04-16 14:56:06
category:my life

Java and Mac OS X: not exactly Apples and, er, Oranges

In response to the ever-growing debate about Java and whether it should be open sourced, Newsforge is running a pair of “point, counter point” articles. Well, not exactly. The boring one is a JavaLobby position piece (or so it seems). The more interesting one is penned by RMS, and questions how free software can really be if it relies on (nay, requires) non-free dependencies such as the Sun or IBM JRE. I don’t always agree with RMS. In fact, it’s possible I disagree with him more often than not. However, in this situation he does an excellent job of articulating something that’s been bouncing around in my head lately. But not about Java.

One of the more interesting sessions at PyCon was Bob Ippolito’s 60 minutes of MacPython. I attended because I love the Mac OS X interface, and I definitely feel more productive and “at home” on Mac OS X than on any Win32 interface. During his presentation Bob talked about the different GUI toolkits available for Python on Mac OS X. These included Tkinter, PyObjC, and wxPython. Of all these, Bob’s recommendation was PyObjC. When asked by an audience member which he would recommend for building cross-platform interfaces, Bob responded (wrongly, in my opinion) that the Mac OS X interface has unique paradigms that don’t translate well to other platforms, and therefore you should use PyObjC for Mac OS X and something else for any other [inferior] platform you might want to support. I’m paraphrasing, but I think I’ve got it mostly right. This is where RMS’s argument about Java comes in.

If I want to write Free Software (and I do) and I want it to run on any machine I use (and I do), then I have to support Win32, Mac OS X and Linux. And writing two (or three!) GUIs isn’t going to work. I don’t have that kind of time. Would my interface be better if I wrote it for each platform individually? Possibly. But I don’t think that it’s a certainty that those “unique paradigms” would translate directly into increased usability. Additionally, as RMS points out, can I call my software Free if it relies on a non-Free kernel or operating system?

After seeing Bob’s demonstration of PyObjC, I’m impressed. It’s obviously powerful and I can’t deny that the ability to use Apple’s development tools is a real boost. The website mentions that many of the unit tests pass on GNUstep, but Bob was frank in admitting that it’s not presently possible to just recompile for GNUstep and have your app work as expected. When that happens, maybe PyObjC and GNUstep, together, will unseat wxPython as my GUI toolkit of choice.

date:2004-04-13 10:07:26

Better Blogging, or, “It’s more than just a blog, Virginia”

This morning I finally found (took?) the time to record my thoughts on how I blog and what’s wrong with how I blog. My general conclusion, for the sake of brevity, is that current blogging tools are difficult to use because they don’t fit with my typical workflow. Blogging is, for me, a way of recording thoughts, ideas and notes and sharing those in a pseudo-collaborative way. The requirement of an additional tool (in my case the excellent Movable Type web interface) is really a huge barrier to the actual act of writing and blogging. In short, blogging should be an extension of what I already do, and should integrate seamlessly with my current tools.

Since writing that post the thought of what would make a good blogging tool has been floating around in my head. What I’m realizing is that I need more than just a convenient entry point into the Movable Type interface. If that was all I needed, then the Webpanel Extender for Firefox would solve the problem nicely. If I only needed a generic XML-RPC client for a blogging tool, then mozBlog would fill the bill. What I really need is a web browser that learns about me and allows me to push information out to different locations on demand. Re-reading the previous sentence, even I’m not sure what it means, but I like it so it stays.

Consider the following scenario, which prompted this entry: I’m working on a paper for my Politics of the European Union class at college. My topic is EU copyright regulation, and I’m at the university library working on some basic research. The method I find most effective for conducting research is to cast as wide a net as possible, and then slowly sift through the resulting information, discarding that which is irrelevant and keeping that which applies to my progressively narrower thesis. Working with online databases at the school library, I’ll print out journal articles and web pages, mark them up later this evening, and end up shredding half of what I print. This works relatively well for me, and allows me to follow the iterative pattern of research and writing that I’ve developed over time. But today, as I gathered my half-inch stack of output from the printer, I realized that if I could apply the same ideas I have for blogging to research, I could dramatically improve my writing process.

Consider for a moment the following revision of this scenario: At the university library, I log onto my account, start up Firefox and log into the academic journal databases. As I perform my search, I have a sidebar open where I can drag URLs, click a button to take a “snapshot” of a page, make short notes to myself and assign annotations and keywords to snippets of information I cut and paste from the open page. Each piece of information retains its source URL, history that got me there and access time: basically, everything I might need to construct a bibliography entry for that piece of information. Tonight, at home, I fire up my browser, and review the information. Data that fits with my continuously refined thesis is sorted into one “pile” and that which is extraneous is filed elsewhere. Even later, when I go to actually write my paper, I have the ability to cut and paste text from articles instead of typing the quotes into my word processor. In short, my research process, streamlined and improved.
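For the curious, here is a minimal sketch of the kind of record that sidebar might keep for each snippet. Everything in it is an assumption on my part: the `Snippet` class, its field names, the `citation` helper and the citation layout are invented for illustration and do not follow any particular style guide or existing tool. It only shows that if the source URL, the trail of pages and the access time are stored with the text, a rough bibliography line falls out almost for free.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Snippet:
    """Hypothetical record for one piece of clipped research material."""
    text: str                      # the passage cut from the page
    source_url: str                # where it came from
    accessed: datetime             # when it was captured
    # Pages visited on the way to the source (the browsing "history").
    trail: list[str] = field(default_factory=list)
    keywords: set[str] = field(default_factory=set)
    # Which "pile" the snippet has been sorted into so far.
    pile: str = "unsorted"


def citation(snippet, title="Untitled"):
    """Build a rough bibliography line from what the snippet remembers."""
    return f"{title}. {snippet.source_url} (accessed {snippet.accessed:%Y-%m-%d})."


if __name__ == "__main__":
    clip = Snippet(
        text="A passage I clipped while researching.",
        source_url="https://example.org/journal/article",  # placeholder URL
        accessed=datetime(2004, 4, 12, 14, 30),
        keywords={"EU", "copyright"},
    )
    clip.pile = "thesis"           # sorted into the keep pile at home
    print(citation(clip, title="An Example Article"))
```

Sorting at home then amounts to changing the `pile` field, and nothing about the source is lost along the way.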

For those of you still paying attention, you may be wondering what this has to do with the title, which mentioned blogging. Everything, I say. To my way of thinking, a tool like this that integrates into our existing applications empowers not just blogging or research, but all sorts of building on and improvement of existing work. The fact that in some cases I publish the resulting annotation as a small text file which PyBlosxom picks up instead of keeping it in a private store is only a semantic difference: the important (and exciting, to me) thing is that all the information that passes through my browser daily becomes available to me at a later date. Both blogging and academic research (and other tasks, I’m guessing) are really just acts of remixing existing works and ideas.

Sure, the online database has a “feature” which allows you to log in and create what they refer to as “persistent searches”. But how many web service accounts do I really need? At what point do I decide that keeping windows open for EBSCO, Lexis-Nexis and ACM Digital Library, just so I can cut and paste, is too much? At what point does the overhead become overwhelming? And why should I keep each database hermetically sealed? Isn’t there some validity to the idea that remixing information from the ACM and Lexis-Nexis could yield interesting results? Surely there’s some intersection of the two. And that intersection is best served by a tool which empowers readers, writers and researchers, instead of restricting them.

Bookmarks? Ha! Those are so 2003.

date:2004-04-12 16:28:06

Blogging is Hard

Blogging is hard, but it shouldn’t be. “How hard can it be to write your rants, thoughts and ideas?” I hear you protesting. But it is hard, or at least harder than it should be. Many times during the day I think, “this is cool; I should blog this.” But when I sit down at my computer, or complete the task I’m working on, I don’t have the time or motivation to open a browser, navigate to the Movable Type interface, and type the events, ideas, and/or commentary in a coherent fashion. This entry? Been thinking about it for a good 48 hours. The result is that I blog less, although possibly each entry receives more thought before being posted. Somehow, though, I’m not convinced that the extra thought is really such a good thing. In fact, I am convinced that there are lots of good ideas in my head that never make it onto the web because of this barrier.

In examining this issue, I think of two things: first, why do I blog, and second, how can it be any easier? So first, why do I blog? I blog for selfish reasons: I want a permanent record of my thoughts, actions and ideas. I also want feedback, criticism and suggestions. In short, I blog because my voice matters, and because blogging is a way for me to collaborate with people I’ve never seen or met. It helps connect me to a larger community that I enjoy being part of.

So how can I make it easier? What “use case” isn’t being satisfied? There are a couple situations I frequently encounter that I wish could be handled more transparently.

First, when I run across a web page or blog entry that either strikes me as insightful or that contains a nugget of information I want to retain. If it’s insightful, I often want to add my voice to the fray; +1, as it were. I don’t always have a pithy comment or addition to make, I just want to reinforce the idea. The pages with some long-sought nugget of information usually get bookmarked, which is really the worst model for me to use: I use several machines, don’t sync my bookmarks, and often have to repeat the search process many times through the life of a project. So “blogging” those pages embeds them in my own space on the web (another selfish application of blogging, I guess).

The second situation I frequently encounter is the desire to post a status update or progress report on an open source project. I work on a couple of open source projects, and because my time is often spread thin, I’d like to post something to my blog when I make an update or pay attention to a particular bug. I’m not vain enough to believe that users are obsessively clicking reload waiting for an update to mozCC, but I do know from my e-mail correspondence that people like to know what’s going on. I like to know what’s going on. But the updates don’t often warrant a full-fledged blog entry, and I don’t feel like I can write enough to make it warrant the effort it takes. It should be easier.

So what can be done to make blogging easier? I don’t believe I’m the only one who feels this way (although I suppose it’s possible), so it seems like there should be some pent up desire. These are the ideas I’ve come up with, in no particular order. I don’t know when I’ll get around to working on them, but I’m really interested in hearing if people have the same problem or if they have other suggestions.

  • a blog manager/posting side-bar for Mozilla and Firefox; if it were right in my browser, it might be easier to work with. I think someone suggested this for Movable Type at one point, but I haven’t seen it implemented. Ideally it would use something like the Atom API so it would work with more than just a single tool.
  • a way to quickly +1 a page and have it reflected in my blog; maybe a “hey, this is cool” mini-blog sidebar like Ernie has?
  • a way to easily post status reports, work logs, etc, similar to .plan files back in the day; or maybe .plan meets IM status messages
  • a better way to collaborate on documents. This doesn’t fall completely into this category, but I’ve been thinking about it a lot since reading A Manifesto for Collaborative Tools in Dr. Dobb’s this month.

Wow. I figured there’d be a lot more items in that list, but the primary one is the first: integrate blogging with a tool I already use. Whether that’s my web browser or e-mail client, if posting to my blog were a first-class software citizen, I think I’d do it more. What about you?

date:2004-04-12 11:50:04