“Do not credit this link” is a useful assertion that cannot be gleaned from surrounding content.
Thus, rel="nofollow" is a good if old idea. At least one of my two search predictions for 2005 is already coming true.
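To make the assertion concrete, here is a minimal sketch of how a link-counting program might honor it. It is only an illustration using Python's standard `html.parser`; the class name `CreditedLinkParser` and the function `split_links` are hypothetical, and no actual search engine is claimed to work this way.

```python
from html.parser import HTMLParser


class CreditedLinkParser(HTMLParser):
    """Sorts anchor hrefs by whether the author asked that they not be credited."""

    def __init__(self):
        super().__init__()
        self.credited = []    # links without rel="nofollow"
        self.uncredited = []  # links the author marked rel="nofollow"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel is a space-separated token list; a valueless attribute yields None.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.uncredited.append(href)
        else:
            self.credited.append(href)


def split_links(html):
    """Return (credited, uncredited) href lists for an HTML fragment."""
    parser = CreditedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.credited, parser.uncredited
```

The point of the sketch is that the "do not credit" signal lives on the link itself, so the program never has to guess the author's intent from the surrounding text.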
Creator-assigned keywords, or “tags,” on the other hand, strike me as a contemporary implementation of HTML meta description tags, which failed for two reasons: they placed a burden on good webmasters (classification is hard), and they presented an open field for spammers, who tag[ged] pages making a hard sell for whatever with completely unrelated keywords.
Global classification strikes me as a case in which Google is right — metadata inferred from content beats explicit, manual metadata when it comes to categorization. From the Peter Norvig (Google Director of Search Quality) interview I cited:
This is a Google News page from last night, and what we’ve done here is apply clustering technology to put the news stories together in categories, so you see the top story there about Blair, and there’re 658 related stories that we’ve clustered together.
Now imagine what it would be like if instead of using our algorithms we relied on the news suppliers to put in all the right metadata and label their stories the way they wanted to. “Is my story a story that’s going to be buried on page 20, or is it a top story? I’ll put my metadata in. Are the people I’m talking about terrorists or freedom fighters? What’s the definition of patriot? What’s the definition of marriage?”
Folksonomies are great in limited domains, thus far most famously for organizing and sharing bookmarks (which could be decentralized using the same technology as Technorati’s self-tagging) and for organizing photos.
Keyword tagging is also a lightweight way to provide navigation for a website. I might categorize more posts on this weblog if I could do so in a similarly lightweight manner (right now I have to create categories via an interface separate from posting). Haven’t I come right back to the creator-assigned keywords that I criticized above? No, there’s a subtle but very important difference: metadata as a side effect of useful work versus metadata as spammy make-work.