Not following tags

“Do not credit this link” is a useful assertion that cannot be gleaned from surrounding content.

Thus, rel="nofollow" is a good if old idea. At least one of my two search predictions for 2005 is already coming true.
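To make the assertion concrete, here is a minimal sketch in Python (standard library only) of how a crawler might separate links that carry credit from links marked rel="nofollow". The class name `CreditedLinkParser`, the helper `split_links`, and the sample markup are all hypothetical, and this is not a description of how any search engine actually processes the attribute.

```python
from html.parser import HTMLParser


class CreditedLinkParser(HTMLParser):
    """Collect link targets, separating out those marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.credited = []    # links the author implicitly vouches for
        self.uncredited = []  # links carrying "do not credit this link"

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel holds a space-separated list of values; compare case-insensitively.
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "nofollow" in rel_values:
            self.uncredited.append(href)
        else:
            self.credited.append(href)


def split_links(html):
    """Return (credited, uncredited) link targets found in an HTML string."""
    parser = CreditedLinkParser()
    parser.feed(html)
    parser.close()
    return parser.credited, parser.uncredited


# Hypothetical markup: one ordinary link, one comment link marked nofollow.
sample = (
    '<p><a href="https://example.org/good">a source I vouch for</a> and '
    '<a rel="nofollow" href="https://example.com/spam">a comment link</a></p>'
)
credited, uncredited = split_links(sample)
```

The point of the sketch is that the decision depends only on an explicit attribute supplied by the linker; nothing in the surrounding text is consulted.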

Creator-assigned keywords, or “tags,” on the other hand, strike me as a contemporary implementation of HTML meta description tags, which failed because they placed a burden on good webmasters (classification is hard) and presented an open field for spammers, who tag[ged] pages making a hard sell for whatever with completely unrelated keywords.

Global classification strikes me as a case in which Google is right — metadata inferred from content beats explicit, manual metadata when it comes to categorization. From the Peter Norvig (Google Director of Search Quality) interview I cited:

This is a Google News page from last night, and what we’ve done here is apply clustering technology to put the news stories together in categories, so you see the top story there about Blair, and there’re 658 related stories that we’ve clustered together.

Now imagine what it would be like if instead of using our algorithms we relied on the news suppliers to put in all the right metadata and label their stories the way they wanted to. “Is my story a story that’s going to be buried on page 20, or is it a top story? I’ll put my metadata in. Are the people I’m talking about terrorists or freedom fighters? What’s the definition of patriot? What’s the definition of marriage?”
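As a toy illustration of inferring categories from content rather than from supplier labels, here is a minimal Python sketch that groups headlines by word overlap (Jaccard similarity). This is emphatically not Google's clustering technology, which the interview does not describe; the stopword list, the 0.25 threshold, the function names, and the sample headlines are all invented for the example.

```python
import re

# A tiny, arbitrary stopword list; a real system would use far more signal.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def words(text):
    """Lowercased set of words in a headline, minus stopwords."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster(headlines, threshold=0.25):
    """Greedily add each headline to the most similar existing cluster,
    or start a new cluster if nothing reaches the threshold."""
    clusters = []  # each cluster is a list of (headline, word_set) pairs
    for headline in headlines:
        w = words(headline)
        best, best_score = None, threshold
        for members in clusters:
            score = max(jaccard(w, other) for _, other in members)
            if score >= best_score:
                best, best_score = members, score
        if best is None:
            clusters.append([(headline, w)])
        else:
            best.append((headline, w))
    return [[h for h, _ in members] for members in clusters]


# Invented headlines, for illustration only.
headlines = [
    "Blair defends Iraq policy in parliament",
    "Parliament questions Blair over Iraq policy",
    "Storm closes airports across Europe",
    "Airports reopen after storm in Europe",
]
groups = cluster(headlines)
```

No supplier had to label anything: the grouping falls out of the text of the stories themselves, which is the property Norvig is pointing at.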

Folksonomies are great in limited domains, thus far most famously for organizing and sharing bookmarks (these could be decentralized using the same technology as Technorati’s self-tagging) and for organizing photos.

Keyword tagging is also a lightweight way to provide navigation for a website. I might categorize more posts on this weblog if I could do so in a similarly lightweight manner (currently I have to create categories via an interface separate from posting). Haven’t I come right back to the creator-assigned keywords that I criticized above? No, there’s a subtle but very important difference: metadata as a side effect of useful work versus metadata as spammy make-work.

2 Responses

  1. […] The second reason I link to Wikipedia preferentially is that Wikipedia article URLs conveniently serve as “tags”, as specified by the rel="tag" microformat. If Technorati and its competitors happen to index this blog this month, it will show up in their tag-based searches, the names of the various Wikipedia articles I’ve linked to serving to name tags. I’ve never been enthusiastic about the overall utility of author-applied tags, but I figure linking to Wikipedia is not as bad as linking to a taggergator. […]

  2. […] Not following tags tries to have it n-ways but is willfully confused: “metadata as a side effect of useful work versus metadata as spammy make work.” Tagspam has utility for web publishers; it is categorization for navigation that is useless make-work: recency and search rule, metadata is crap. […]
