Post Blogs

Not following tags

Thursday, January 20th, 2005

“Do not credit this link” is a useful assertion that cannot be gleaned from surrounding content.

Thus, rel="nofollow" is a good if old idea. At least one of my two search predictions for 2005 is already coming true.
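
A minimal sketch of how a link-counting crawler might honor that assertion, using Python's standard html.parser. The names CreditedLinks and split_links are made up for illustration, and this is not a description of how any search engine actually does it:

```python
from html.parser import HTMLParser


class CreditedLinks(HTMLParser):
    """Collect link targets, setting aside those marked rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.credited = []    # links that may count as endorsements
        self.uncredited = []  # links the author asked not to be credited

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        if href is None:
            return
        # rel holds a space-separated list of tokens; nofollow may be one of several.
        rel_tokens = (attr.get("rel") or "").lower().split()
        if "nofollow" in rel_tokens:
            self.uncredited.append(href)
        else:
            self.credited.append(href)


def split_links(html):
    """Return (credited, uncredited) link targets found in an HTML fragment."""
    parser = CreditedLinks()
    parser.feed(html)
    parser.close()
    return parser.credited, parser.uncredited
```

Only the credited list would feed into whatever link-based ranking the crawler does; the uncredited links are still links, they just carry no vote.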

Creator-assigned keywords, or “tags,” on the other hand, strike me as a contemporary implementation of HTML meta description tags, which failed because they placed a burden on good webmasters (classification is hard) and presented an open field for spammers, who tagged their pages with completely unrelated keywords while making a hard sell for whatever.

Global classification strikes me as a case in which Google is right — metadata inferred from content beats explicit, manual metadata when it comes to categorization. From the Peter Norvig (Google Director of Search Quality) interview I cited:

This is a Google News page from last night, and what we’ve done here is apply clustering technology to put the news stories together in categories, so you see the top story there about Blair, and there’re 658 related stories that we’ve clustered together.

Now imagine what it would be like if instead of using our algorithms we relied on the news suppliers to put in all the right metadata and label their stories the way they wanted to. “Is my story a story that’s going to be buried on page 20, or is it a top story? I’ll put my metadata in. Are the people I’m talking about terrorists or freedom fighters? What’s the definition of patriot? What’s the definition of marriage?”

Folksonomies are great in limited domains, thus far most famously for organizing and sharing bookmarks (decentralize using same technology as Technorati’s self-tagging) and organizing photos.

Keyword tagging is also a lightweight way to provide navigation for a website. I might categorize more posts on this weblog if I could do so in a similarly lightweight manner (now I have to create categories via an interface separate from posting). Haven’t I come right back to the creator-assigned keywords that I criticized above? No, there’s a subtle but very important difference: metadata as a side effect of useful work versus metadata as spammy make work.

N-level blog entry references

Monday, December 27th, 2004

Dear LazyWeb,

Bloglines, Technorati and probably others do a passable job of presenting direct references to a blog entry. (Minor complaints: With Bloglines you have to subscribe to a feed or preview with a “siteid” internal to Bloglines; if your blog has multiple duplicative feeds (e.g., rdf/rss2/atom) direct entry references only appear for one of the feeds; Bloglines makes no attempt to consolidate or allow feed owners to consolidate; Technorati appears to use screen scraping and picks up some garbage along the way.)

So here’s my LazyWeb request:

I want to know, without lots of extra clicking, not just resources that directly cite blog entry A, but resources that cite resources that cite blog entry A, and so on. In the context of Bloglines, instead of “2 references” I want to see “2 direct references, 3 level1 indirect references, 1 level2 indirect reference”, “6 references, 2 direct” or similar, and I want to be able to see all references, direct and indirect, on a single page. In the context of Technorati, indirect references could optionally be part of a “watchlist” feed.
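
To make the request concrete, here is a rough sketch of the counting, assuming the service already has a map from each resource to the resources that cite it. The cited_by structure and both function names are hypothetical. It is just a breadth-first walk over the "cited by" graph, counting each resource once at the shallowest level where it appears:

```python
from collections import deque


def references_by_level(cited_by, entry):
    """Group everything that cites `entry`, directly or indirectly, by level.

    cited_by maps a resource to the resources that cite it (assumed input).
    Level 0 holds direct references; level n holds "level-n indirect" ones.
    """
    seen = {entry}
    levels = {}
    queue = deque([(entry, 0)])
    while queue:
        node, depth = queue.popleft()
        for citer in cited_by.get(node, ()):
            if citer in seen:
                continue  # already counted at a shallower (or equal) level
            seen.add(citer)
            levels.setdefault(depth, set()).add(citer)
            queue.append((citer, depth + 1))
    return levels


def summarize(levels):
    """Render the counts in the style of the request above."""
    parts = []
    for depth in sorted(levels):
        count = len(levels[depth])
        label = "direct" if depth == 0 else f"level{depth} indirect"
        noun = "reference" if count == 1 else "references"
        parts.append(f"{count} {label} {noun}")
    return ", ".join(parts)
```

The seen set also keeps the walk from looping forever when two blogs cite each other.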

Will Bloglines, Technorati, or some up and coming aggregation service please do this?

I’m not terribly interested in visualization of social networks implied by blogs (blogosphere visualization, blogversation maps?) or even blogthread visualization and the like here. Neat, but too heavyweight to use daily. I just want a small feature increment.

Speculate on Creators

Wednesday, November 17th, 2004

Alex Tabarrok writes about An Auction Market for Journal Articles (PDF). Publishers bid for the right to publish a paper. The amount of the winning bid is divided among the authors and publishers of papers cited by the paper just auctioned. Unless I’m missing something, all participating journals taken together lose money unless the share of cited authors is zero and transaction costs are nil. Still, the system could increase incentives to publish quality papers, where “subsequent authors will want to cite this” is a proxy for quality.

I’m reminded a tiny bit of BlogShares (“Blogs are valued by their incoming links and add value to other blogs by linking to them”), but especially of Ian Clarke‘s FairShare, which is a proposal for speculative donations:

Anybody can “invest” in an artist, and if that artist goes on to be a success, then the person is reward in proportion to their investment and how early they made it. But where does this return on investment come from? The answer is that it comes from subsequent investors. For example, lets say that you invest $10. $4.50 might go straight to the band, $1 might go to the operator of the system, and the remaining $4.50 would be distributed among previous investors in the band, those who invested more early would get a bigger proportion than those who invested less, later-on. Of course, most people will not make a profit, but they are rewarded by knowing that they contributed towards an artist that they liked, and helped reward others who believed in that artist, and who may have brought the artist to their attention.

Under FairShare, participating creators, taken together and individually, would make money, as payments come from outside the system, driven by the generosity and greed of fans and speculators.
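
The arithmetic in the quoted example can be sketched as follows. The 45%/10%/45% split comes from the $10 example above; the weighting of earlier investors (amount divided by position in line) and the handling of the first investment are my own assumptions, since the quoted passage says only that those who invested more and earlier get a bigger proportion. The function name is hypothetical, and amounts are in cents:

```python
from fractions import Fraction


def split_investment(amount_cents, earlier,
                     artist_share=Fraction(45, 100),
                     operator_share=Fraction(10, 100)):
    """Split one new investment among artist, operator, and earlier investors.

    earlier is a list of (investor, amount_cents) in the order invested.
    Returns (artist_cents, operator_cents, {investor: cents}).
    """
    artist = amount_cents * artist_share
    operator = amount_cents * operator_share
    pool = amount_cents - artist - operator
    if not earlier:
        # Assumption: with no earlier investors, their pool goes to the artist.
        return artist + pool, operator, {}
    weights = {}
    for position, (investor, invested) in enumerate(earlier, start=1):
        # Assumed weighting: larger and earlier investments weigh more.
        weights[investor] = weights.get(investor, 0) + Fraction(invested, position)
    total = sum(weights.values())
    payouts = {who: pool * weight / total for who, weight in weights.items()}
    return artist, operator, payouts
```

With a new $10 investment and two earlier $20 investors, the artist gets $4.50, the operator $1, and the earlier investors split the remaining $4.50 two to one in favor of whoever got in first.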

A system in the spirit of one or both of these proposals could perhaps help fund a voluntary collective licensing scheme of the sort contemplated for digital music, but conceivably applicable to other types of work.

If the journal market idea really could foster a self-sustaining business model it could be a boon to the open access movement. Restricting access is rather pointless when your main business concern is to get your articles cited.

I’ve rambled about open access models elsewhere.

Invitation Marketing: Six Gmail Shills Available

Thursday, September 2nd, 2004

Consulting firm Accenture has a paper called Invitation Marketing: Using Customer Preferences to Overcome Ad Avoidance. While the paper paints in broad strokes, it is clear that Google has implemented a variation with great (Orkut) and even greater (Gmail) success.

How many otherwise respectable folk have you seen dedicating email broadcasts and blog entries to announcing that they have a few Gmail invites to give away, especially in the last couple weeks? I lost count long ago. Ad avoidance overcome, indeed.

Kudos to Google’s marketing department.

Updates: Wendy Seltzer cited this post: Gmail’s Viral Marketing. My trackback broke. Oops.

I like Joey Hess‘s take on the Gmail invite virus: stop wasting my time with gmail. Joey notes that the going price on eBay for both Gmail invites and ancient 1 gigabyte hard drives is less than one dollar.

Sloths and Their Slothfulness

Tuesday, June 8th, 2004

Via Elizabeth Rader I discovered Kairosnews criticizing the Creative Commons weblog and others for using non-free weblog software. The CC weblog currently uses the “lars-blogger” package for OpenACS, both GPL.

I would’ve posted a comment to Kairosnews, but that would’ve required registering and logging in. Trackback is great for sloths.

Sort of apropos: I didn’t switch to WordPress, but I did delay starting a public blog for ages while waiting for simple libre blog software that supports pretty URLs, comments, trackbacks, pings, syndication, etc. Other reason for delay: slothfulness.

Will weblog software disappear as a category? I want to manage an entire site with one application (up til now: “vi”, more or less). It isn’t hard for a CMS to include a nice weblog feature. It is kind of a pain for users to force weblog applications to serve as a whole-site CMS, though many people do that.

Creative Commons 2.0 Licenses

Tuesday, May 25th, 2004

Turned on last night. See the CC weblog for a thorough explanation of the versioning.

No upgrade required here: this weblog is dedicated to the public domain.

Client-side remixing isn’t so loopy

Saturday, March 13th, 2004

Lucas Gonze’s analysis of client-side remixing is spot on. Summary: client-side remixing is to precise synchronization as HTML is to precise layout. If you don’t need precision, enjoy.

I see three limits to client-side remixing. All can be raised:

  • Bad client software. It either doesn’t work or barely works and you need a very keen eye to find a gratis download amongst enticements to buy a super-premium subscription version (cf RealPlayer).
  • Lack of expressivity. Remixers don’t just overlay source segments, they also apply various effects to the same.
  • Streaming-like experience. In order to obtain a smooth client-side remix playback you (actually your client, this is a subset of “bad client software”) will have to download most of the needed source content first. I often have a bad experience with playing-while-downloading of individual songs and videos over the net, never mind many coordinated sources.

I suspect that with excellent client software the client-side remix experience could be very good. Lack of expressivity seems like the toughest hurdle to me. However, if said excellent client software can download and run code safely … effectlets?

Video games seem like a highly constrained example of what client-side remixing could do. They pull off co-ordinating lots of different source media (sometimes all local, but that’s beside the point) with code quite well, versus hardcoding different sources into a single stream at the point of production.

However, using client-side remixing anytime in the near future to evade those who would prevent distribution of The Grey Album and the like is pointless. Client-side remixing isn’t up to the task, and you can still download the album from the web after weeks of brouhaha, never mind P2P networks.

Memory augmentation: cc-metadata client-side remixing [1] [2]

REGISTER NOW. IT’S FREE AND IT’S REQUIRED.

Thursday, February 26th, 2004

Experimenting with vote links:

Goodbye, WP, join the LAT in the infinite unread bin.