Saturday, January 22

Packrats

It is interesting that people have absolutely no problem filling up the latest storage devices despite the rapid advances in data storage technology. This topic has seen some attention from the Slashdot crowd. I do not think that the amount of useful data on a typical computer grows at the same astounding rate. So, where does the space go?

Probably most of the data is multimedia - movies and songs. I wonder how many of those are played just once or twice and then left alone for the rest of time, or until the hard drive dies, whichever comes first. It is none of my business what people do with their multimedia, but I think that it is foolish to hoard things that you will never use.

Movies, in my experience, are a one-off thing. Some do deserve several viewings, but then you do not wait a year before watching them again. I cannot think of a good reason not to delete a movie, unless it is one of the very few special ones that you actually would want to see again after a long time has passed. Could it be because it "took me a week to download, so it's precious"? Perhaps a friend might want it some time in the future (I can hear the MPAA growling, but I'm not in the USA so I do not care), but then save him/her a few hours and lend a good book instead.

Songs are a slightly different matter, in particular because I have a larger store of them than I would like (80MB of MIDI from the old times, 5GB of MP3s just locally on my laptop and 350MB of sheet music). I could argue that I do listen to various pieces occasionally. I understand why people are reluctant to delete music. However, I still maintain that collecting for the sake of collecting is not a bright thing to do.

Collecting trash is not smart either. Some people store every bit of information that they have come in contact with. I say, who cares about the SMS messages, archives of mailing lists, notes, articles that you found "interesting" at the time, artifacts of experiments and primitive one-off scripts? In theory, they could come in useful one day, but in practice they overwhelmingly do not.

I suspect that some people put up with the trash just because they are too lazy to clean it up. Unorganised files pile up, and then in a few years they have trouble finding anything. Well, I think that just like in real life it pays to have a tidy work environment.

The big question is, why am I discussing this? I think that usability and efficiency problems arise out of the sheer amount of cruft accumulated, and we do not notice. Technical problems arise too, but that is not important here. Database-like file systems are promising, but maybe we could get along with what we have now if we revised our habits a bit. I am not comfortable with the fact that the amount of useless information is increasing so rapidly, and that we are battling it by improving search technology. It would be much better if the signal-to-noise ratio could be improved.

That is a sensible message for developers too: users should be encouraged to throw away what they do not need.

Such an approach of tidying things up does not work on distributed systems, for obvious reasons. That is why we do need good search capabilities and the semantic web. However, in most cases, you are the boss of your computer, so locally you can organise things however you want.

There are practical reasons other than searching to keep only useful data around. Backups are smaller and therefore easier to do, therefore you do them more frequently, therefore your data is safer, QED. Furthermore, there is less risk of accidentally throwing away something important while cleaning up junk when you need some extra space. And for me it's a nice feeling that my computer is not a huge pile of virtual trash with the important things buried somewhere inside.
