leah blogs

April 2005

17apr2005 · A Tagged Filesystem

I often notice that computer users, especially novices, have trouble understanding the filesystem and organizing their files, and therefore have trouble finding them again.

Usually, this results in the user saving each file into whatever directory the “Save…” dialog happens to show, or stuffing everything into one large directory…

Tags to the rescue! We can build a Tagged File System, which doesn’t have a hierarchy, only tags that can be attached to files. This could even be represented in Unix, provided you can alter the file dialogs of your applications: all files get saved into a hidden directory, .everything. Tagging a file hardlinks it into a directory named after the tag. Now you can simply “copy”, “move” and “delete” the file between tag directories, thereby only changing its tags. To really delete files, a tool would need to look into .everything for files that no longer have a link in any tag folder. (Actually, you could use only tag folders and no .everything, but this may be a bad idea; read on.)
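The hardlink scheme above can be sketched in a few lines of Python. This is only a sketch under assumptions: the function names tag, untag and untagged are made up for illustration, the tag directories are assumed to sit next to .everything under one root, and a link count of 1 is taken to mean “no tags left”, which only holds if nothing outside the tree hardlinks to the file.

```python
import os
from pathlib import Path

STORE = ".everything"  # hidden directory holding every file (name from the post)

def tag(root: Path, name: str, tag_name: str) -> None:
    """Attach a tag by hardlinking the file into the tag's directory."""
    tag_dir = root / tag_name
    tag_dir.mkdir(exist_ok=True)
    link = tag_dir / name
    if not link.exists():
        os.link(root / STORE / name, link)

def untag(root: Path, name: str, tag_name: str) -> None:
    """Remove a tag; the file itself stays in the store."""
    (root / tag_name / name).unlink()

def untagged(root: Path) -> list[str]:
    """List files in the store that have no link in any tag directory."""
    # Assumption: a link count of 1 means only the .everything entry is left.
    return sorted(
        p.name
        for p in (root / STORE).iterdir()
        if p.is_file() and p.stat().st_nlink == 1
    )
```

A cleanup tool could then offer everything returned by untagged() for real deletion.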

The problem now is that because all files reside in .everything, they all need to have a different basename. I first played with the idea of moving the complete system into a Tagged File System, but then I analyzed my disks: My root directory (Mac OS X 10.3.8 installation with a big home and lots of cra^Wstuff installed) has 603484 files, and there are 81498 basename clashes, so roughly every seventh file clashes. Additionally, 17807 different names are used as parts of directory paths. That would be 17807 tags!

When I reduced the analysis to my home directory, there were still 25092 clashes in 204372 files, roughly every eighth file (mostly due to files like Makefile, COPYING and info.nib, which are common in developers’ home directories).
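A count like this could be reproduced with a short script. The sketch below is not the script used for the numbers above; the function name basename_clashes is made up, and it assumes that every file beyond the first one with a given basename counts as one clash.

```python
import os
from collections import Counter

def basename_clashes(top: str) -> tuple[int, int]:
    """Return (total_files, clashes) for the tree rooted at top."""
    names = Counter()
    for _dirpath, _dirnames, filenames in os.walk(top):
        names.update(filenames)
    total = sum(names.values())
    # Assumption: each file after the first with the same basename is a clash.
    clashes = total - len(names)
    return total, clashes
```

For a tree holding three files called Makefile and one other file, this reports 4 files and 2 clashes.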

Of course, I’m not the target user of this, but these results nevertheless tell me that Tagged Filesystems are best used only for directories like “Documents” or “Music” (assuming your files aren’t called 01.ogg, 02.ogg…). In these kinds of directories, name clashes are rather rare, so here tagging can fully pay off.

One very nifty thing would be implementing the Tagged Filesystem using LUFS or Hurd’s filesystem translators, so you could do stuff like (assuming the Tagged Filesystem is mounted at ~/music)

ls ~/music/blues/clapton

very easily. By the way, the above would be the same as

ls ~/music/clapton/blues

of course! One may want to invent a syntax to implement negation too, so you could do

ls ~/music/-clapton/blues

to show all music files tagged as Blues that are not by Eric Clapton. Another nice thing would be to have computed tags, like “Files created last week”, “Files changed after last backup” and so on.
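Resolving such a path is just set arithmetic, which is also why the order of the tags doesn’t matter. Here is a minimal sketch, assuming the filesystem keeps an in-memory index from tag name to a set of file names; the function name resolve and the index layout are hypothetical, and the leading “-” is the negation syntax suggested above, not an existing convention.

```python
def resolve(path: str, index: dict[str, set[str]]) -> set[str]:
    """Return files matching every tag in path; a leading '-' negates a tag."""
    parts = [p for p in path.split("/") if p]
    result = set().union(*index.values())  # start from all tagged files
    for part in parts:
        if part.startswith("-"):
            # Negated tag: drop every file that carries it.
            result -= index.get(part[1:], set())
        else:
            # Plain tag: keep only files that carry it.
            result &= index.get(part, set())
    return result

index = {
    "blues": {"a.ogg", "b.ogg"},
    "clapton": {"b.ogg", "c.ogg"},
}
print(resolve("blues/clapton", index))   # same result as "clapton/blues"
print(resolve("-clapton/blues", index))  # blues files without the clapton tag
```

A computed tag would fit into the same scheme if its set of files were calculated on demand instead of being stored in the index.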

I think Tagged Filesystems could help the average user a lot, and still be backward compatible enough with classic, hierarchical filesystems to stay accessible from the shell. Of course, this requires OS and application developers to actually implement them, and to make them so easy and natural to use that “average” people will actually use them.

NP: Eric Clapton & B.B. King—Worried Life Blues
