Back to this bookmarks thing ..
By Guillaume Mouron on Saturday 10 February 2007, 02:22 - Permalink
I've been thinking again about my bookmarks manager (see my previous post) and came up with two different ideas ...
The first solution came to me last night. Use a blogging platform and add a new blog entry for each bookmark, with associated tags ... That might seem a bit "overkill" but it has several advantages that I will detail below.
The second solution is, of course, the DIY solution. But I've been thinking of a specific one I would build. It's not necessarily the one best suited for this task but it would involve some technologies I'd like to work with.
It would be a quite simple xml file, one entry per bookmark, with a <taglist> containing the tags, a <state> tag to mark it as read or unread (very often I keep some bookmarks to read them later, and very often I forget to do so), probably another one to know if it's a personal bookmark or a professional bookmark (to be able to keep both in the same place, but separated), a field to keep the date and probably one for a title and a last one for a description (which will mostly remain empty as I am too lazy to fill it in :).
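To make the file layout a bit more concrete, here is a minimal sketch in Python of what one entry could look like and how the unread ones could be listed. Only <taglist> and <state> come from the description above; every other name (<bookmarks>, <bookmark>, <url>, <title>, <description>, <date>, <tag>, the scope attribute) and the example.org URLs are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import date


def make_bookmark(url, title, tags, state="unread", scope="personal",
                  description="", day=None):
    """Build one <bookmark> element.

    Only <taglist> and <state> come from the post; the other element
    names and the 'scope' attribute are hypothetical.
    """
    entry = ET.Element("bookmark", {"scope": scope})  # personal / professional
    ET.SubElement(entry, "url").text = url
    ET.SubElement(entry, "title").text = title
    ET.SubElement(entry, "description").text = description
    ET.SubElement(entry, "date").text = (day or date.today()).isoformat()
    ET.SubElement(entry, "state").text = state  # "read" or "unread"
    taglist = ET.SubElement(entry, "taglist")
    for tag in tags:
        ET.SubElement(taglist, "tag").text = tag
    return entry


def unread_titles(root):
    """Return the titles of all bookmarks still marked as unread."""
    return [b.findtext("title") for b in root.findall("bookmark")
            if b.findtext("state") == "unread"]


# Two made-up entries, one personal and one professional.
root = ET.Element("bookmarks")
root.append(make_bookmark("http://example.org/relaxng", "RelaxNG tutorial",
                          ["xml", "relaxng"], day=date(2007, 2, 10)))
root.append(make_bookmark("http://example.org/wsdl", "WSDL primer",
                          ["webservice"], state="read", scope="professional",
                          day=date(2007, 2, 9)))

print(ET.tostring(root, encoding="unicode"))
print(unread_titles(root))
```

Keeping the personal/professional distinction as an attribute rather than a child element is just one possible choice; a schema would settle that either way.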
Why use an xml file and not a database backend ?
Well, because there are several xml technologies I'd like to test or to reuse. The first one is the RelaxNG schema language. I've already used XSD schemas several times and I've read everywhere that RelaxNG is much simpler to use but does the task as well, so I want to give it a try.
The second one is xslt ... I've already used xslt but I want to continue using it and improve my skills. I would use xslt to build a complete set of pages from my xml bookmark files, with the corresponding queries when I select a set of tags, counts for each tag, etc.
The third one, linked to the previous point, is that I would like to minimize the use of a scripting language. And if possible, use python instead of php ... I'm not really fond of php but I'll use it if I have to. I would only use the scripting language to build a small interface for adding a new bookmark ... or maybe, why not, provide a webservice to do so ? With a nice WSDL file.
Well, now, let's see the pros and the cons of each solution :
Blog Solution :
+ easy to set up and get it working
+ search integrated (by date, etc etc ...)
- is it possible to get the blog entries combining several tags ? (let's say I have a link tagged with "java" and "webservice", can I retrieve it by combining those two tags or do I need to browse through all links tagged as "java" or all links tagged as "webservice" ?). It mainly depends on the blog engine, but does one that meets my requirements exist ?
- Needs manual synchronization with your web browser (or maybe it's possible with an extension and the xmlrpc interface of the blog ? I think dotclear and wordpress, at least, have one)
DIY XML Solution :
+ Will fit all my needs
+ Makes me work with different technologies
+ Possibility to make a firefox extension to synchronize with the firefox bookmark manager ? (maybe by tweaking some existing extension that lets you upload your bookmarks to your ftp ?).
- Takes time
So, it still needs some thought and maybe some advice from friends ?
Comments
I would opt for the second solution, even if I wonder whether there really is an optimal solution. As you mention, I also forget to read what I don't read immediately. My best bookmark tool is probably Google : I don't waste time organizing them, and I find them quite quickly
Note that "Script as a glue" concept with xml/xslt is exactly what Éric Daspet explained during the FIIFO PHP Forum
It seems to me that you could try to hack something quickly (in Python or PHP), and try to see if it fits your needs. You should be able to get a very simple system in a few hours, in order to experience it before it is even finished, and see what should be changed and what is to stay. Try a very simple page with an AJAX field to search, and a tree using ul & lis.
About XML: I wouldn't think this is a good idea unless you have to share data among applications or services ("Write Once, Read Everywhere"). Using XML for storage kind of misses the point of having an easily shareable format, even though RelaxNG might be cool to play with.
I guess a few SQL tables should do the trick, as an index is considerably more efficient than looking for info "by hand" in the whole XML document. The del.icio.us extension might have some useful info on how to sync the bookmarks.html file with a web server.
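As a rough sketch of the "few SQL tables" idea, here is what it could look like with Python's built-in sqlite3 module. The table and column names (bookmark, bookmark_tag, idx_tag) and the sample links are invented for the example, not taken from any existing tool. It also shows one way to answer the question from the post about retrieving a link by combining several tags.

```python
import sqlite3

# In-memory database; the schema below is hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookmark (
        id    INTEGER PRIMARY KEY,
        url   TEXT NOT NULL,
        title TEXT
    );
    CREATE TABLE bookmark_tag (
        bookmark_id INTEGER NOT NULL REFERENCES bookmark(id),
        tag         TEXT NOT NULL,
        PRIMARY KEY (bookmark_id, tag)
    );
    -- Index so that lookups by tag do not scan the whole table.
    CREATE INDEX idx_tag ON bookmark_tag(tag);
""")


def add_bookmark(url, title, tags):
    """Insert one bookmark and one row per tag."""
    cur = conn.execute("INSERT INTO bookmark (url, title) VALUES (?, ?)",
                       (url, title))
    conn.executemany("INSERT INTO bookmark_tag VALUES (?, ?)",
                     [(cur.lastrowid, tag) for tag in tags])


def with_all_tags(tags):
    """Return titles of bookmarks carrying every tag in `tags` (non-empty)."""
    tags = sorted(set(tags))
    marks = ",".join("?" * len(tags))  # one placeholder per tag
    rows = conn.execute(
        f"""SELECT b.title
            FROM bookmark b
            JOIN bookmark_tag t ON t.bookmark_id = b.id
            WHERE t.tag IN ({marks})
            GROUP BY b.id
            HAVING COUNT(DISTINCT t.tag) = ?
            ORDER BY b.title""",
        (*tags, len(tags)))
    return [row[0] for row in rows]


add_bookmark("http://example.org/axis", "Java web services", ["java", "webservice"])
add_bookmark("http://example.org/jvm", "JVM internals", ["java"])
add_bookmark("http://example.org/soap", "SOAP in Python", ["webservice", "python"])

print(with_all_tags(["java", "webservice"]))
```

The GROUP BY / HAVING pair keeps only the bookmarks that matched as many distinct tags as were asked for, which is what "combining" tags means here.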
Oh and about tags... This post has only one tag with 3 words. That is exactly what I despise about tags: human beings don't really think with keywords, they have concepts that can be organized (in folders, say), and will very often use different keywords for the same page/concept, thus having trouble organizing data efficiently using this method.
Nicolas, I know the xml solution is not the best one around here, but I really want to play with those xml technologies, so I think I'll stick to that. I don't have in mind another project where I could use them (well actually I have one but I think it's even less of a good idea to use xml with it ...) and using xml is not completely crazy here ...
Concerning tags, that was a mistake I made with the dotclear interface ... So first, the interface is apparently not good enough (or at least not for me), but, as we discussed, you pointed out a real problem with tags and how we use or misuse them ...
Olivier, I also use google a lot and thus don't bookmark a lot of things, but when you find something that didn't come straight from google, and you know that you might have difficulties finding it again, it's always convenient to bookmark it ...
"and will very often use different keywords for the same page/concept, thus having trouble organizing data efficiently using this method."
Sure but even if the classification is imperfect it is valuable.
If you work in a constrained domain, say IT or medicine, one solution to simplify the problem is to impose a finite set of tags on users. For instance (in the IT domain) if a user wants to tag the concepts "reverse engineering", "disassembling" or "retro engineering", the only tag provided by the system would be "reverse engineering". Problem : maintaining this list of tags is painful.
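A small Python sketch of that idea, assuming the system keeps a hand-maintained table that maps each accepted spelling to the one tag it provides. The CANONICAL table and both function names are made up for the example; only the three "reverse engineering" variants come from the comment above.

```python
# Hypothetical controlled vocabulary: accepted spelling -> tag the system provides.
CANONICAL = {
    "reverse engineering": "reverse engineering",
    "disassembling": "reverse engineering",
    "retro engineering": "reverse engineering",
}


def normalize(tag):
    """Return the canonical tag, or None if the tag is not in the vocabulary."""
    key = " ".join(tag.lower().split())  # ignore case and extra whitespace
    return CANONICAL.get(key)


def normalize_all(tags):
    """Split user input into (accepted canonical tags, rejected raw tags)."""
    accepted, rejected = [], []
    for tag in tags:
        canonical = normalize(tag)
        if canonical is None:
            rejected.append(tag)
        elif canonical not in accepted:  # drop duplicates, keep order
            accepted.append(canonical)
    return accepted, rejected


print(normalize_all(["Disassembling", "retro  engineering", "cooking"]))
```

The painful part mentioned above is visible right away: every new synonym means another line in the table.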
Kototama, I think we agree on the imperfection, and you gave a good example of different tags describing roughly the same concept.
A finite list is indeed painful, but an extensible list can be used along with auto-completing fields. This way the user is guided when tagging his items, and can make fewer mistakes (hopefully).
I don't really understand why anyone would want to use tags instead of, say, nested folders such as (oh!) the bookmarks in a web browser. Tags can be useful when communicating with machines, but for a human being they can be confusing.
I would think the best solution might be to have nested folders like the browser bookmarks, with - why not - tags attached to the folders and links. Also, one could imagine using the del.icio.us API to retrieve them, for example: that would make the search easier and more accurate. Clicking on a folder would display the links, as well as a tag cloud representing the most frequently used tags in this folder. Now that'd be fancy.
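For what it's worth, the server side of such an auto-completing field could be sketched like this in Python. The TagIndex class and its methods are hypothetical names, and lower-casing every tag is an assumption of the example: it keeps the known tags sorted, suggests the ones starting with what the user has typed so far, and still lets new tags be added.

```python
import bisect


class TagIndex:
    """Sorted, extensible list of known tags with prefix completion."""

    def __init__(self, tags=()):
        self._tags = sorted({t.lower() for t in tags})

    def add(self, tag):
        """Add a new tag unless it is already known."""
        tag = tag.lower()
        i = bisect.bisect_left(self._tags, tag)
        if i == len(self._tags) or self._tags[i] != tag:
            self._tags.insert(i, tag)

    def complete(self, prefix, limit=5):
        """Return up to `limit` known tags starting with `prefix`."""
        prefix = prefix.lower()
        # All tags sharing the prefix sit next to each other in sorted order.
        i = bisect.bisect_left(self._tags, prefix)
        matches = []
        while (i < len(self._tags) and len(matches) < limit
               and self._tags[i].startswith(prefix)):
            matches.append(self._tags[i])
            i += 1
        return matches


index = TagIndex(["java", "javascript", "python", "reverse engineering"])
print(index.complete("ja"))
```

Suggesting existing tags first does not forbid new ones; it only makes the existing spelling the path of least resistance.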
"I don't really understand why anyone would want to use tags instead of, say, nested folder such as (oh!) the bookmarks in a web browser."
Perhaps because not all concepts can be organized into a hierarchy. If you take a hard drive, would you classify it in the folder "data storage" or "part of my PC" ? Probably both.
I got lost around here so I won't disturb you any longer. But I find it really funny that four French people are speaking to each other in English ^^
(quite interesting discussion though)